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ABSTRACT 


In  this  paper,  the  importance  of  maintaining  the  quality  of  information  in  the  Un¬ 
listed  Master  File  will  be  established.  The  Enlisted  Master  File  is  generated  by  the  Navy 
Enlisted  System,  one  of  many  applications  which  process  data  for  the  Manpower,  Per¬ 
sonnel.  and  Training  community.  To  clarify  what  technologies,  policies,  and  procedures 
can  contribute  to  improved  data  quality,  a  framework  for  classifying  these  initiatives  is 
developed.  The  data  quality  control  environment  of  the  Navy  Enlisted  System  is  then 
evaluated  with  respect  to  that  framework.  Two  deficiencies  in  the  data  quality  control 
environment  are  identified.  One  is  the  lack  of  techniques  to  measure  the  quality  of  data 
in  the  Enlisted  Master  File,  and  the  otner  is  the  lack  of  comprehensive  plans  for  data 
quality  control  for  the  data  base  which  will  be  a  successor  to  the  master  file.  A  tech¬ 
nique  for  assessing  data  quality  is  then  tested,  but  its  application  to  the  Navy  Enlisted 
System  was  not  successful  in  this  limited  study.  Technologies  which  could  contribute  to 
enhanced  data  quality  in  the  environment  of  the  future  are  discussed,  and  a  plan  for 
actively  managing  data  quality  is  proposed.  In  closing,  specific  recommendations  for 
improving  the  current  data  quality  control  environment  in  the  Total  Force  Information 
Systems  Management  Department  are  presented. 
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1.  INTRODUCTION 


The  Navy  must  properly  manage  its  personnel  to  successfully  accomplish  its  objec¬ 
tives.  In  the  1990s.  a  time  of  shrinking  fiscal  resources,  effective  strategic  planning  for 
manpower,  personnel,  and  training  is  absolutely  essential.  Reliable  assessments  of 
overall  personnel  strength  must  be  provided  to  develop  recruiting  and  accession  plans. 
Accurate  information  regarding  the  skills  of  the  current  force  is  needed  to  set  training 
quotas.  Mobilization  readiness  and  the  ability  to  recall  large  numbers  of  personnel  must 
be  maintained.  In  the  technological  world  in  which  the  Navy  operates,  it  must  make 
effective  use  of  each  of  its  specialists.  Tracking  their  qualifications  is  critical.  Compre¬ 
hensive  information  about  language  competency  may  be  pivotal  in  a  crisis  situation. 
Incorrect  security  clearance  information  cannot  be  tolerated.  Benefit  and  educational 
programs  must  be  managed  to  a  man.  Whether  measuring  the  personnel  profile  of  the 
entire  Navy  for  planning,  or  tracking  the  career  of  an  individual  sailor,  accurate  per¬ 
sonnel  data  is  an  absolute  necessity. 

The  Navy's  managers  of  this  manpower,  personnel,  and  training  information  are 
well  aware  of  the  importance  of  their  role  in  meeting  broader  service-wide  strategies. 
The  goals  of  the  customer  service  center  they  operate  are  depicted  in  figure  1.  In  sup¬ 
port  of  their  customers: 

All  data/information  initiatives  are  intended  to  meet  the  overall  goal  of  providing 
timely  and  accurate  information  support  to  users  and  decision  makers  in  order  to 
increase  the  Navy's  fleet  readiness  and  mobilization  ability.  (CIRMP.  19S9.  p.3-3 1 

However,  the  information  managers  have  not  yet  devised  programs  which  provide  a  level 
of  data  quality  which  meets  their  customers'  needs.  T  his  is  due  in  large  part  to  the  age 
and  complexity  of  the  current  systems. 

A.  THE  QUALITY  OF  PERSONNEL  DATA  NEEDS  IMPROVEMENT 

The  Navy  currently  collects  its  enlisted  personnel  data  through  an  intricate  network 
of  interfacing  systems.  Some  data  are  gathered  interactively,  with  data  validation  per¬ 
formed  as  they  are  input.  Other  data  are  transmitted  through  the  message  system  and 
consolidated  by  a  central  facility.  The  data  are  then  processed  by  the  Navy  Unlisted 
System  (NHS)l.  a  personnel  accounting  application  developed  in  1973.  NTS  is  run  in  a 

I  To  aid  in  readability,  Appendix  A  contains  a  list  of  aeronvms  with  then  long  names. 
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MPT  FLEET  SUPPORT 


Figure  1.  MPT  Customer  Support  Responsibilities:  This  figure  was  taken  from 
a  briefing  provided  by  OP- 16. 
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FLEET  READINESS 


batch  environment,  with  updates  five  times  a  week.  Various  tapes,  reports,  and  sum¬ 
maries  are  produced  for  use  by  managers  of  Manpower,  Personnel,  and  Training  (MPT) 
organizations.  Individual  sailors'  records  are  used  by  assignment  officers,  Chief  Petty 
Officer  Selection  boards,  and  a  host  of  others.  Though  the  quality  of  the  data  in  the 
Enlisted  Master  File  (EMF)  which  NES  produces  has  improved  over  time,  it  is  still  an 
issue.  In  fact,  during  a  1989  conference  of  MPT  functional  users,  the  first  concern 
identified  was  the  "Accuracy  and  timeliness  of'  data,  (and]  responses  to  requests  &  re¬ 
ports."  (CIRMP,  19S9,  Figure  3-01,  p.3-6) 

This  iequirement  tor  better  data  is  not  without  precedent  elsewhere.  The  govern¬ 
ment  has  passed  legislation  which  requires  that  stored  data  be  maintained: 

The  Privacy  Act  of  1974,  along  with  other  federal  regulations,  has  firmly  established 
the  importance  of  maintaining  accurate,  complete,  and  unambiguous  information  in 
computerized  record  systems.  (Laudon,  1986,  p.4) 

In  1983,  data  integrity  and  quality  was  not  ranked  in  a  study  of  information  profes¬ 
sionals'  top  concerns.  However,  by  19S6  it  ranked  22nd.  higher  than  such  issues  as  de¬ 
cision  support  systems,  computer  graphics,  and  relational  data  base  management 
systems  (Branehcau,  1987,  p.23).  Business  managers  too.  feel  that  stored  data  need  to 
be  accurate  "Given  a  choice,  managers  have  a  strong  preference  for  improvement  in 
quality  of  information  over  an  increase  in  quantity."  (Davis,  I9S5,  p.215)  Those  in  the 
field  of  information  science  have  begun  research  to  measure  the  data  quality  problem. 
Mahmoud  and  Rice  conducted  a  study  of  database  vendors,  and  found  the  overall 
quality  of  the  data  they  provided  to  be  wanting  in  several  areas,  particularly  in  how 
vendors  actually  check  accuracy  and  deal  with  outliers.  They  concluded  that  "Database 
accuracy  is  an  important  area  of  study  because  successful  planning  and  business  deci¬ 
sions  depend  on  accurate  forecasts,  which,  in  turn,  depend  on  accurate  data."  (1988. 
p.249)  This  conclusion  mirrors  that  reached  by  the  MPT  functional  users. 

One  important  initiative  which  the  MPT  Information  Resource  (IR)  managers  feel 
will  improve  the  quality  of  EMF  data  is  the  transition  of  the  data  structure  from  a  Hat 
master  file  to  a  data  base.  While  moving  to  a  data  base  technology  will  certainly  facili¬ 
tate  data  sharing  and  eliminate  redundancy,  it  will  not  resolve  all  problems  with  data 
quality.  Continued  vigilance  will  be  necessary.  In  the  early  1980s,  Brodie  acknowledged 
that  "Database  reliability  and  integrity  are  poorly  understood.  In  fact,  data  quality 
maintenance  is  a  more  severe  problem  than  program  reliability."  (1980,  p.246)  T  his  issue 
will  continue  to  be  of  importance  throughout  the  1990s,  as  Martin  described: 


3 


We  can  draw  a  picture  of  the  computing  facilities  of  a  typical  future  corporation. 
There  will  he  one  or  more  large  computer  centers--in  many  cases  about  the  same 
number  as  there  are  today  but  with  faster  computers.  These  centers  will  be  inter¬ 
connected  by  tel  .communications  and  will  be  jointly  on  line  to  most  parts  of  the 
corporation.  They  will  perform  those  computing  operations  which  still  benefit  from 
centralization  rather  than  distribution,  for  example,  large  number-crunching  oper¬ 
ations.  large-scale  print  runs  or  printing  needing  special  equipment,  maintenance  of 
files  which  are  by  their  nature  centralized,  running  old  centralized  applications  which 
have  not  yet  been  converted  to  distributed  form,  and  (particularly  important)  the 
maintenance  of  corporate  data  bases.  (1981,  p.23)  (emphasis  Martin's  and  mine) 

Far  from  solving  all  data  quality  problems,  data  base  technology  may  only  exacerbate 
them: 

The  need  for  accurate  and  complete  data  increases  as  more  uses  arc  made  of  those 
data.  An  accurate  and  complete  data  element  in  a  dedicated  system  only  affects  that 
system,  but  in  an  environment  where  multiple  users  use  the  same  data,  the  problem 
can  be  much  more  acute.  The  advantage  of  data  base  can  only  be  achieved  when 
the  integrity  of  the  data  base  can  be  ensured.  (Perry.  1983,  p.50) 

Those  responsible  for  MPT  Information  Resources  Management  (IRM)  must  keep  in 
mind  that  "The  rapid  evolution  of  database  concepts  has  been  accompanied  by  the  de¬ 
velopment  of  increasingly  complex  information  systems  with  correspondingly  complex 
data  quality  problems."  (Brodie,  19S0,  p.253)  Unfortunately,  there  is  not  a  great  deal  of 
research  about  how  to  improve  the  quality  of  the  contents  of  a  master  file  or  data  base. 
Most  of  the  research  has  focused  on  how  to  ensure  that  data  is  not  corrupted  during  the 
collection,  transmission,  and  storage  processes.  MPT  IRM  managers  are  well  aware  of 
these  issues,  and  have  adequate  resources  to  resolve  them.  Technical  expertise  is  avail¬ 
able  from  contractors.  While  ensuring  the  integrity  of  these  processes  is  an  important 
part  of  a  data  quality  control  program,  MPT  IRM  managers  must  also  consider  the 
timeliness,  completeness,  and  accuracy  of  the  actual  data  values. 

This  thesis  will  survey  the  current  state  of  data  quality  control  for  the  NFS  and 
provide  recommendations  for  its  improvement.  This  will  be  accomplished  by  building  a 
fr umework  for  classifying  data  quality  initiatives,  by  describing  where  various  techniques 
fit  into  that  framework,  and  by  applying  the  framework  to  the  NFS  environment.  From 
that  exercise,  two  major  deficiencies  in  the  data  quality  control  environment  of  NFS  will 
be  identified,  and  methods  for  eliminating  these  deficiencies  will  be  evaluated.  Lastly, 
recommendations  for  improving  data  quality  control  will  be  proposed. 

To  establish  the  relative  importance  of  maintaining  quality  data,  an  issue  easily  ig¬ 
nored  by  managers  in  a  world  of  competing  priorities,  this  chapter  depicts  how  the  need 
for  improved  data  quality  has  permeated  the  strategic  planning  of  the  IRM  organization 
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responsible  for  the  NES.  Though  the  issue  of  data  quality  is  addressed  in  numerous 
ways,  there  is  no  cohesive  plan  for  ensuring  that  quality  will  improve.  This  is  not  to  say 
that  nothing  has  been  done,  many  initiatives  undertaken  by  MPT  1R  managers  have 
improved  data  quality.  What  is  missing  is  an  overall  plan  that  has  as  its  primary  goal, 
improved  EMF  quality.  Lastly,  this  chapter  describes  why  the  management  of  the  data 
quality  control  program  must  continue  to  be  overseen  by  the  IR  managers. 

In  Chapter  II,  the  framework  for  classifying  data  quality  initiatives  is  developed. 
This  framework  can  be  used  to  assess  the  current  state  of  data  quality  control  in  an  or¬ 
ganization,  and  will  be  applied  specifically  to  the  NFS  environment.  To  make  the  dis¬ 
cussion  of  the  framework  more  meaningful,  basic  terms  are  defined  and  the  evolution 
of  the  framework  is  discussed. 

In  Chapter  III,  two  deficiencies  in  NES  data  quality  management  arc  identified,  by 
surveying  the  organizational  controls  present  with  respect  to  the  framework  established 
in  Chapter  II.  These  deficiencies  are  the  inability  to  accurately  assess  the  data  quality 
of  the  EMF,  and  the  need  for  enhanced  data  management  when  NES  transitions  to  a 
new  system.  Resolving  these  deficiencies  is  by  no  means  trivial.  In  order  to  establish 
some  perspective  on  the  scope  of  the  challenge,  the  organizational  environment  and 
software  characteristics  of  NES  are  described. 

In  Chapter  IV,  the  problem  of  assessing  data  quality  in  the  EMF  is  addressed.  An 
approach  which  uses  statistical  decision  theory  is  tested  (Morey,  19S2),  and  difficulties 
in  implementing  the  technique  are  explored.  An  integer  programming  model  which  uses 
these  assessments  of  data  quality,  along  with  a  number  of  other  parameters,  to  allocate 
resources  to  data  maintenance  techniques  (Ballou  and  Kumar. Tayi,  1989),  is  briefly  dis¬ 
cussed. 

In  Chapter  V,  a  plan  for  managing  the  quality  of  the  data  in  the  new  personnel  data 
base  is  proposed.  This  plan  is  a  general  one  which  could  be  applied  to  any  new  system 
which  is  data  intensive. 

In  the  last  chapter  of  this  thesis,  specific  recommendations  for  better  managing  the 
data  quality  of  the  EMF  are  summarized. 

B.  MPT  IRM  PLANNING  EMPHASIZES  IMPROVED  DATA  QUALITY 

Managers  of  the  MPT  IRM  function  realize  that  improved  data  quality  is  an  im¬ 
portant  strategic  objective,  both  for  the  MPT  business  and  the  IRM  organization. 
References  to  its  crucial  role  are  abundant  in  planning  documents.  The  paragraphs  be¬ 
low  point  out  how  essential  improved  data  quality  is.  and  how  the  MPT  IR  managers 


have  addressed  that  concern.  While  no  specific  plan  for  improving  data  quality  has  been 
developed,  many  initiatives  list  better  data  quality  as  a  potential  benefit.  Since  the  MPT 
community  is  extensive,  two  lead  organizations  have  been  tasked  with  acting  as  the 
Chief  of  Naval  Operation's  (CNO)  MPT  IRM  agent. 

In  order  to  clarify  how  MPT  IRM  strategies  are  conceived  and  implemented,  the 
relationship  of  the  lead  organizations  is  described.  The  CNO's  division  for  Total  Force 
Information  Resources  and  Systems  Management  (OP- 16)  sets  policy  regarding  what 
IRM  issues  the  MPT  community  will  pursue.  The  CNO's  organization  is  referred  to  as 
OPNAV,  and  its  divisions  are  called  OP  codes.  The  Total  Force  Information  Systems 
Management  Department  of  the  Naval  Military  Personnel  Command  (NMPC-16),  who 
works  for  the  Chief  of  Naval  Personnel  (CNP),  must  implement  this  policy.  The 
OPNAV  and  NMPC  houses  of  the  organization  work  together  very  closely,  and  the  head 
of  these  organizations  is  the  same  individual.  For  more  information  on  the  various  roles 
of  these  organizations  and  others  see  the  "Manpower,  Personnel  and  Training  (MPT) 
Information  Resources  Management  (IRM)  Program"  (OPNAV  Instruction  5230.22, 
19S6)  or  The  MPT  Information  Resources  Management  Strategy,  Volume  I:  Executive 
Overview  (MPT  IRM,  Volume  I,  19S8,  p.C-1). 

OP-16/NMPC-16  take  their  guidance  from  the  Navy's  overall  information  manage¬ 
ment  policy.  This  IRM  policy  is  spelled  out  in  the  Secretary  of  the  Navy  's  (SECNAV) 
instruction  entitled  "Department  of  the  Navy  (DON)  Strategic  Plan  for  Managing  In¬ 
formation  and  Related  Resources  (IRSTRATPLAN)"  (SECNAV  Instruction  5230.10. 
1987).  In  addition,  the  program  guidance  and  reporting  requirements  are  detailed  in 
another  SECNAV  Instruction  called  "Information  Resources  (IR)  Program  Planning" 
(SECNAV  Instruction  5230.VA,  1985).  To  support  IR  Program  Planning  certain  activ¬ 
ities  must  produce  a  Component  Information  Resources  Management  Plan  (CIRMP). 
As  the  MPT  IRM  agent  OP-16/NMPC-I6  produce  the  CIRMP. 

The  CIRMP  provides  an  overview  of  what  the  MPT  IRM  organization  is  doing  to 
manage  information  as  a  strategic  resource.  Improved  quality  is  a  recurrent  theme  in 
these  plans.  One  of  16  MPT  IRM  long  range  strategies  is  to  "Use  IRM  principles 
throughout  the  MPT  community  to  insure  quality  and  valid  data."  (CIRMP,  1989, 
p.1-17)  This  is  to  support  one  of  29  major  MPT  business  initiatives  which  is  to  "Im¬ 
prove  Information  Quality  and  Timeliness."  (CIRMP,  1989,  Figure  1-09,  p.  1-16) 

One  of  CNP's  goals  is  to  make  better  use  of  resources  across  the  board.  To  that 
end,  he  has  introduced  Total  Quality  Management  (TQM)  to  his  organization.  The 
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CIRMP  shows  that  the  MPT  IRM  community  wants  TQM  principles  to  influence  the 
quality  of  information  products: 

Traditionally,  product  quality  (whether  manufactured  products  or  information  pro¬ 
ducts)  had  been  addressed  reactively  with  programs  like  Quality  Assurance.  This 
type  of  strategy  focuses  on  product  improvement  through  inspection  and  error  de¬ 
tection  after  completion.  The  TQM  approach  is  "proactive:"  it  focuses  on  the 
achievement  of  product  quality  through  the  continuous  improvement  of  all  the 
component  processes  which  in  their  totalitv,  determine  the  qualitv  of  the  product. 
(CIR.MP,  1989,  p.3-12) 

The  CIRMP  is  replete  with  these  references  to  TQM  and  the  philosophies  it  espouses. 
It  also  relates  TQM  initiatives  to  data  quality  initiatives.  "Data  standards,  the  Data 
Quality  Assurance  Program,  and  the  Total  Quality  Management  (TQM)  process  provide 
the  framework  for  improving  and  maintaining  the  quality  and  integrity  of  MPT  data." 
(CIRMP,  1989,  p.2-4)  However,  throughout  the  document  no  plan  for  using  specific 
TQM  initiatives  to  improve  data  quality  is  set  forth. 

The  data/information  initiatives  mentioned  above,  must  be  implemented  in  an  en¬ 
vironment  filled  with  many  constraints.  In  fact,  "Manage[ing]  IRM  in  a  fiscally  con¬ 
strained  environment"  is  listed  as  one  of  eight  key  IRM  directions  for  CNP's 
organization.  (CIRMP,  19S9,  p.3-1)  Three  of  the  11  assumptions  and  constraints  listed 
among  CNP's  chief  directions  and  trends  are  related  to  the  increasingly  difficult  funding 
environment.  These  are  that  "There  will  be  increasing  competition  for  less  personnel  and 
funding  resources,"  that  "The  role  of  management  will  increase  in  order  to  attempt  to 
effect  savings  in  personnel  and  money,"  and  that  "There  will  be  more  and  more  'micro- 
management'  from  senior  managers  and  organizations,  including  OSD  [Office  of  the 
Secretary  of  Defense],  the  Office  of  Management  and  Budget,  and  Congress."  (CIRMP, 
1989,  p.3-2) 

What  is  the  main  point  of  this  confusing  conglomeration  of  issues,  goals,  and  strat¬ 
egies?  It  is  that  data  quality  has  consistently  been  a  concern  of  managers,  and  that  a 
comprehensive  plan  for  data  quality  improvement  must  be  developed.  Currently,  MPT 
IRM  managers  have  undertaken  many  initiatives  to  improve  data  quality,  but  an  overall 
data  quality  control  program  has  not  been  developed.  As  the  defense  budget  continues 
to  shrink  throughout  the  1990s,  military  managers  must  be  able  to  do  more  with  less. 
If  improved  data  quality  will  facilitate  this,  it  must  remain  a  top  priority. 
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C.  NMPC-16  MUST  TAKE  A  LEAD  ROLE  IN  DATA  QUALITY 

While  information  managers,  functional  users,  and  systems  auditors  all  play  a  role 
in  ensuring  that  quality  data  is  maintained,  the  lead  role  must  be  taken  by  1R  managers. 
Within  the  MPT  1RM  community,  there  is  a  strong  trend  towards  expanding  user  re¬ 
sponsibility  for  data  quality.  This  trend  is  fine,  as  long  as  NMPC-16  realizes  that  they 
must  continue  to  take  the  lead,  by  providing  the  users  with  the  methods  necessary  to 
assess  and  improve  the  quality  of  the  data  for  which  they  are  responsible.  Specifically, 
NMPC-16  cannot  afford  to  assume  that  the  data  maintenance  function  will  be  shoul¬ 
dered  entirely  by  users  in  the  future.  The  paragraphs  below  explain  current  trends  and 
detail  why  NMPC-16  must  continue  to  plan  and  budget  for  a  data  maintenance  activity. 

In  recent  years,  with  the  advent  of  on-line  systems  and  end-user  computing,  much 
discussion  has  focused  on  data  quality  and  the  role  of  the  user.  The  trend  has  been  to 
shift  some  of  the  responsibility  for  maintaining  data  to  users.  This  phenomenon  has 
taken  place  because  the  user  now  has  more  access  to  the  data  base,  is  more  familiar  with 
it  and  what  it  means,  and  probably  can  make  a  better  determination  of  its  veracity. 
However,  data  maintenance  will  remain  a  shared  concern: 

The  integrity  of  the  contents  of  the  data  base  is  the  joint  responsibility  of  the  users 
and  the  data  base  administrator.  The  data  base  administrator  is  concerned  more 
about  the  integrity  of  the  structure  and  the  physical  records,  while  the  users  are 
concerned  about  the  contents  or  values  contained  in  the  data  base.  (1983,  p.9 1 ) 

Perry  further  elaborates  on  this  concept,  explaining  that  the  tasks  listed  below,  are  those 
which  will  help  to  maintain  data  quality.  IR  managers  must: 

1.  Identify  the  method  of  ensuring  the  completeness  of  the  physical  records  in  the 
data  base. 

2.  Determine  the  method  of  ensuring  the  completeness  of  the  logical  structure  of  the 
data  base  (i.c.  schema). 

3.  Determine  which  users  have  responsibility  for  the  integrity  of  which  segments  of 
the  data  base. 

4.  Develop  methods  to  enable  those  users  to  perform  their  data  integrity  responsibil¬ 
ities. 

5.  Determine  at  what  times  the  integrity  of  the  data  base  will  be  verified,  and  assure 
there  arc  adequate  backup  data  between  periods  of  proven  data  integrity.  (Perry, 
1983,  p.9l) 

What  is  interesting  about  Perry's  approach  is  that  it  doesn't  abandon  users  to  maintain 
data  quality  without  the  support  of  the  information  managers.  To  a  certain  degree, 
NMPC-16  has  started  to  implement  a  user-oriented  strategy  for  the  EMI  's  data  main- 
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tenance.  The  data  elements  and  their  characteristics  have  been  analyzed  and  standard¬ 
ized  for  inclusion  in  a  Data  Dictionary/Directory  System  (DD/DS).  Along  with  this, 
responsibility  for  each  data  clement  has  been  assigned  to  a  functional  manager.  Some 
rejected  transactions  are  sent  back  to  the  input  source  for  correction.  However, 
W1PC-16  has  no  specific  plans  to  develop  quality  assessment  and  maintenance  tools  for 
use  by  functional  managers.  NMPC-16  should  not  abandon  its  control  of  corporate 
data  quality.  While  the  users  can  and  should  play  an  essential  role,  the  \1PT  Chief  In¬ 
formation  Officer  should  ultimately  be  accountable  for  the  quality  of  data  in  the  master 
files  or  data  bases. 

Auditors  have  also  played  a  role  in  the  maintenance  of  data  quality.  They  arc  re¬ 
sponsible  for  devising  methods  to  evaluate  the  accuracy  of  stored  data,  perhaps  because 
information  systems  professionals  did  not  build  these  into  systems  in  the  first  place. 
There  is  an  entire  profession  dedicated  to  auditing  Electronic  Data  Processing  systems, 
and  their  skills  need  to  tapped  by  information  systems  managers.  In  addition,  informa¬ 
tion  systems  managers  should  begin  to  assess  the  quality  of  the  data  in  their  own  sys¬ 
tems.  This  is  nothing  more  or  less  than  good  management.  Auditors  are  fine,  but  each 
system  must  have  an  established  set  of  data  quality  control  techniques  to  be  applied  to 
the  system  regularly,  accomplishing  the  following  objectives: 

Internal  control  accomplishes  three  major  objectives.  First,  the  "methodology"  is 
designed  to  insure  [ensure]  that  the  accounting  system  provides  accurate,  complete, 
reliable  and  up-to-date  information  for  making  of  management  decisions.  Second, 
it  is  intended  to  insure  [ensure]  compliance  with  policy  directives,  and  legal  require¬ 
ments.  And  finally  it  protects  the  organization  from  carelessness,  inefficiency  and 
outright  fraud.  (Neumann,  1977,  p.ll) 

At  this  point,  no  regular  auditing  is  done  on  NES  data.  In  the  future.  NMPC-16  man¬ 
agers  would  be  wise  to  establish  internal  controls  for  data  quality. 

D.  CHAPTER  SUMMARY 

In  this  chapter,  three  central  themes  were  developed.  The  first  theme  established  the 
importance  of  improving  the  quality  of  data  in  the  EMF.  The  fact  that  the  need  for 
improved  data  quality  ranks  high  among  all  concerns  of  managers  and  users  of  Auto¬ 
mated  Data  Processing  (ADP)  systems  was  discussed.  The  discussion  closed  with  an 
outline  of  how  this  thesis  would  attack  the  problem  of  improving  the  quality  of  EMF 
data.  Simply  stated,  the  strategy  for  achieving  improved  EMF  quality  is  as  follows:  a 
framework  for  classifying  data  quality  improvement  techniques  will  be  developed  and 
applied  to  the  NES  environment,  holes  in  the  application  of  that  framework  will  be 
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noted,  methods  of  plugging  those  holes  will  be  proposed,  and  recommendations  will  be 
summarized.  The  second  theme  of  this  chapter  emphasized  that  organizational  objec¬ 
tives  acknowledged  the  importance  of  accurate  personnel  data  as  an  MPT  strategic  re¬ 
source.  However,  it  pointed  out  that  in  spite  of  stated  goals  and  objectives,  no  explicit 
program  to  provide  overall  coordination  for  controlling  data  quality  has  been  developed. 
Many  MPT  IRM  initiatives  have  contributed  to  data  quality,  and  the  survey  approach 
used  in  this  thesis  will  document  those  which  impacted  on  NTS.  The  last  theme  in  this 
chapter  explored  the  trend  to  place  responsibility  for  data  quality  on  end  users  or  audi¬ 
tors.  This  discussion  served  to  dissuade  MPT  IR  managers  from  relying  too  heavily  on 
data  base  technology  and  end  users  to  completely  resolve  their  corporate  data  mainte¬ 
nance  problems. 
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II.  DATA  QUALITY  IMPROVEMENT 

This  chapter  develops  a  framework  by  which  data  quality  improvement  techniques 
can  be  categorized  and  managed.  To  enhance  the  discussion  regarding  how  this  frame¬ 
work  was  devised,  definitions  and  background  that  explain  what  data  quality  is  and  how 
it  is  maintained  will  be  provided.  Since  the  field  of  data  quality  and  its  maintenance  does 
not  seem  to  be  as  well-documented  or  as  clearly  defined  as  that  of  software  quality  and 
its  maintenance,  parallels  between  the  two  will  be  drawn.  This  provides  a  basis  for 
comparison,  pointing  out  that  data  quality  can  be  further  defined,  quantified,  and  im¬ 
proved.  The  framework  for  classifying  Data  Quality  Initiatives  will  be  u>ed  in  the  next 
chapter  to  survey  data  quality  control  with  respect  to  the  L:MF. 

A.  WHAT  IS  DATA  QUALITY? 

Data  quality  is  not  a  term  which  has  a  standard  definition  among  information  sys¬ 
tems  professionals.  Sometimes  the  term  data  integrity  is  used  in  place  of  data  quality, 
sometimes  it  is  used  to  mean  something  different.  Date  says  that  "the  term  integrity' 
refers  to  the  accuracy  or  correctness  of  data  in  the  database."  He  further  explains  that 
"Many  systems  that  claim  to  provide  data  integrity  are  actually  using  the  term  to  mean 
concurrency  control  instead."  (1986,  p.444)  (Date's  emphasis)  Weber  says  "It  [data 
integrity]  is  a  state  implying  data  has  certain  attributes:  completeness,  soundness,  purity, 
and  veracity."  (1982,  p.8)  Later,  he  classifies  data  quality  control  as  one  of  six  elements 
necessary  to  maintain  data  base  integrity: 

To  maintain  the  integrity  of  the  database,  the  database  administrator  must  under¬ 
take  six  control  measures:  (a)  definition  control,  (b)  existence  control,  (c)  access 
control,  (d)  update  control,  (e)  concurrence  control,  and  (0  qualitv  control.  (Weber, 
1982,  p.170) 

He  implies  that  data  quality  is  but  an  element  of  overall  data  integrity,  and  along  with 
Date,  says  that  concurrency  control  is  a  component  data  integrity,  but  not  the  whole 
picture.  When  .Martin  addresses  data  integrity,  he  discusses  issues  such  as  consistency, 
locks,  conflict  analysis,  transaction  loss,  and  deadlocks  (1981,  pp. 287-306).  He  does  not 
mention  accuracy  or  completeness  of  information.  Since  none  of  the  researcher's  de¬ 
scriptions  precisely  capture  the  subject  of  this  study,  a  working  definition  of  data  quality 
is  proposed.  When  the  term  data  quality  is  used  here,  it  means  the  degree  to  which 
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stored  data  ^-present  actual  events  as  they  took  place,  or  the  degree  to  which  stored  data 
record  facts  from  a  definitive  source. 

B.  COMPONENTS  OF  DATA  QUALITY 

Various  attributes  of  data  quality  exist,  but  which  ones  are  commonly  identified? 
To  determine  this,  a  much  simplified  version  of  the  approach  taken  by  McCall  and  as¬ 
sociates  with  respect  to  software  quality  components  is  used.  By  compiling  many  re¬ 
searcher's  definitions  of  software  quality  components  and  comparing  and  categorizing 
them,  McCall  and  associates  identified  11  characteristics  of  software  quality.  These  in¬ 
cluded:  maintainability,  flexibility,  testability,  portability,  reusability,  interoperability, 
correctness,  reliability,  efficiency,  integrity,  and  usability  (1977,  p.3-5).  With  respect  to 
data  quality,  Laudon  used  the  three  components  record  completeness,  record  inaccuracy, 
and  record  ambiguity  to  measure  the  quality  of  records  in  Criminal-History  Systems 
(19S6.  p.6).  Ballou  and  Pazer's  model  for  assessing  the  quality  of  data  and  processes  in 
information  systems  addressed: 

Accuracy  (the  recorded  value  is  in  conformity  with  the  actual  value),  timeliness  (the 
recorded  value  is  not  out  of  date),  completeness  (all  values  for  a  certain  variable  are 
recorded),  and  consistency  (the  representation  of  the  data  is  the  same  in  all  cases). 
(1985,  p.  153) 

Weber  said  that  completeness,  soundness,  purity,  and  veracity  are  components  of  data 
quality  (1982,  p.8).  Date  mentioned  the  elements  accuracy  and  correctness  (1986. 
p.444).  A  Chief  of  Naval  Operation's  publication,  the  MPT  IR.M  Data  Quality  Guideline 
names  the  components  accuracy,  timeliness,  and  completencrs  (DCNO,  I9SS).  In 
Table  1  the  occurrence  of  these  elements  is  summarized. 

Components  which  appear  most  often  include:  completeness,  timeliness,  and  accu¬ 
racy.  Completeness  is  the  easiest  component  of  data  quality  to  determine.  It  is  simply 
a  measure  of  whether  all  data  which  should  have  been  recorded,  were  recorded.  There 
can  be  some  subtleties  involved  when  measuring  completeness.  If  the  field  or  attribute 
being  measured  is  an  optional  one,  it  may  be  impossible  to  determine  whether  it  was 
intentionally  left  blank.  Accuracy  is  difficult  to  validate,  by  any  means  other  than  a 
manual  record  check.  Inputs  may  be  a  valid  entry  or  code,  passing  all  edits  and  not 
creating  an  error,  and  still  be  inaccurate.  Another  attribute  of  data  quality  is  timeliness, 
this  is  usually  defined  as  the  length  of  time  it  takes  for  an  actual  event  or  fact  to  be  re¬ 
corded.  For  example,  if  a  sailor  enlists  today,  how  long  does  it  take  to  record  facts  such 
as  his  name  and  date  of  birth?  Later  when  he  is  advanced,  how  long  does  it  take  to  rc- 
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Table  1.  DATA  QUALITY  COMPONENTS 


COMPONENT 

LAUDON 

PAZER  & 
BALLOU 

DAVIS 

WEBER 

DATE 

NAVY 

ACCURACY 

X 

n 

wm 

x1 

X 

COMPLETENESS 

X 

wm 

wm 

X 

X 

PRECISENESS 

X 

X 

TIMELINESS 

X 

X 

CONSISTENCY 

X 

CORRECTNESS 

X 

SOUNDNESS 

X 

PURITY 

X 

VERACITY 

X 

cord  that  event  on  the  file?  However,  just  timeliness  does  not  adequately  describe  this 
attribute  of  quality.  Perhaps  an  additional  measure,  similar  to  what  Ballou  and  Pazer 
call  timeliness,  could  be  volatility.  This  would  be  defined  as  the  length  of  time  the  data 
is  expected  to  be  accurate.  The  value  for  volatility  would  vary  from  data  element  to  data 
element,  as  some  remain  constant  for  the  entire  life  of  the  record,  and  others  change  with 
different  frequencies. 

C.  AN  EARLY  CLASSIFICATION  SYSTEM 

The  early  research  in  the  area  of  data  quality  dealt  primarily  with  managing  errors. 
This  research  focused  on  the  input  process,  and  how  better  controls  there  could  reduce 
the  occurrence  of  errors.  In  1969,  Varley  wrote  a  paper  with  the  following  objective: 
"The  purpose  of  the  paper  is  to  provide  procedures  for  detecting  and  correcting  data 
input  errors  introduced  by  the  human  observer."  (1969,  p.l)  In  his  paper,  Varley  made 
frequent  mention  of  the  Standard  Navy  Maintenance  and  Material  Management  Infor¬ 
mation  System,  on  which  he  based  his  study.  He  was  able  to  generalize  many  of  his 
findings,  eventually  developing  "a  model-for  evaluating  the  various  detection  and  cor¬ 
rection  alternatives"  taking  into  account  "The  necessary  relationships  between  data 
worth,  accuracy  and  cost."  (Varley,  1969,  p.l) 
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At  the  time  of  Varlcy's  research,  not  much  was  published  regarding  the  data  input 
problem.  Initiatives  for  improving  the  quality  of  the  data  focused  on  character  recog¬ 
nition,  automatic  source  data  collection,  and  on-line  computer  systems  (Varlcy,  1969, 
p.44).  Interestingly  enough  two  of  these  three  initiatives,  character  recognition  and  on¬ 
line  computer  systems,  have  played  a  role  in  the  development  and  enhancement  of  the 
AES. 

In  order  to  better  understand  the  process  of  error  detection,  Varlcy  identified  seven 
independent  locations  where  it  could  take  place.  These  are: 

•  Data  Generator 

•  Data  Checker 

•  Keypunch  Location 

•  Local  Computing 

•  Central  Computing 

•  Data  Systems  Analysts 

•  Information  User 

He  explained  how  personnel  at  each  of  these  locations  in  the  data  collection,  processing, 
and  use  chain  can  often  detect  and  sometimes  correct  errors.  In  many  systems  today, 
especially  those  which  allow  on-line  input  of  data,  the  data  generator,  data  checker,  and 
keypunch  location  are  one  in  the  same.  This  is  true  of  the  Source  Data  System  (SDS), 
which  collects  data  for  the  AES. 

Having  defined  the  locations  where  errors  could  occur,  Varlcy  also  categorized  error 
detection  and  correction  procedures  based  on  the  resources  used.  This  classification 
system,  detailed  below,  is  useful  only  for  classifying  error  detection  and  correction  pro¬ 
cedures  typically  undertaken  during  the  maintenance  phase  of  the  Systems  Development 
Life  Cycle  (SDLC): 

•  System-manual  Procedure  Class  -  This  class  refers  to  using  manuals  which  provide 
detailed  procedures  for  collecting  data. 

•  Manual-visual  Procedure  Class  -  This  class  refers  to  verifying  data  using  catalogs 
or  reference  materials. 

•  Manual-EAM  [Electronic  Auditing  Machinery]  Procedure  Class  -  The  process  of 
manually  verifying  input  with  the  help  of  admissibility  checking  falls  in  this  class. 

•  Computer-aided  Validation/Admissibility  Procedures  -  This  class  refers  primarily 
to  admissibility  edits  and  relational  checking  between  data  elements. 

•  Computer-aided  Statistical  Procedures  -  "This  class  of  procedures  uses  cither  sta¬ 
tistical  inference  or  probability  techniques  for  estimating  the  presence  of  errors." 
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•  Computer-aided  Table  Look-up  Procedures  -  This  refers  to  verifying  input  values 
against  a  table. 

•  Computer-aided  Master  File  and  Cross-Reference  Table  Procedures  -  These  pro¬ 
cedures  are  those  which  use  outside  files  for  validation  and  possibly  correction  of 
data.  (Varley,  1969,  pp.  143-144) 

D.  CLASSIFYING  TECHNIQUES  FOR  DATA  QUALITY 

If  it  is  possible  to  identify  where  errors  are  generated,  and  to  target  a  particular  class 
of  techniques  to  correct  those  errors,  then  what  is  missing  to  achieve  effective  data 
quality  control?  According  to  Varley.  it  was  the  concept  of  error  priority  and  how  to 
establish  that  priority.  He  felt  that  there  should  be  a  means  to  identify  the  highest  pri¬ 
ority  errors  or  the  most  valuable  data.  As  Varley  defines  it,  "The  concept  of  error  pri¬ 
ority  can  be  stated  as  the  difference  between  the  worth  of  a  data  element  when  it  is 
accurate  and  the  worth  of  the  data  element  when  it  is  in  error."  (1969,  p.lOll  Error 
prioritv  then,  is  predicated  on  a  more  basic  concept,  the  concept  of  data  value.  "That 
is,  what  price  per  unit  of  accuracy  are  the  users  willing  to  pay  for  accurate  data.7  ' 
(Varley,  1969,  p.  157)  This  question  is  all  important,  for  it  prosides  a  framework  for 
measuring  how  much  it  is  worth  to  an  organization,  for  a  particular  data  element  to  be 
maintained  at  a  specific  level  of  accuracy.  Once  this  is  established,  it  can  be  weighed 
against  the  cost  of  doing  error  detection  and  correction  techniques  such  as  those  ex¬ 
plained  above.  Optimally  organizations  should  be  defining  the  worth  of  their  data,  and 
then  allocating  resources  to  maintain  the  data  within  specified  tolerances.  What  has 
probably  kept  organizations  from  measuring  data  worth,  after  all  Varley  wrote  his  paper 
20  years  ago,  is  the  difficulty  in  doing  so.  The  problem  that  exists  for  many  systems,  and 
certainly  for  those  that  are  the  object  of  this  study,  is  that  no  one  measure  of  data  worth 
will  do: 

In  most  cases  the  value  of  data  to  users  changes  from  data  element  to  data  element 
as  well  as  from  user  to  user.  This  is  more  the  rule  than  the  exception,  lhe  system 
designer  therefore,  must  decide  what  level  of  accuracy  should  prevail  for  each  of  the 
data  elements.  It  is  quite  possible  and  reasonable  to  assume  that  some  data  ele¬ 
ments  arc  easier  to  bring  to  a  given  level  of  accuracy  than  other  elements.  This  may 
require  the  system  designer  to  perform  the  cost  analysis  at  the  data  element  level 
rather  than  the  system  level.  (Varley,  1969,  p.164) 

This  can  be  a  tremendous  job  in  large  systems;  the  EMF  currently  carries  several  hun¬ 
dred  data  elements,  and  its  successor  data  base  is  designed  to  carry  over  500  (LDM 
Report,  1989,  p.l).  Economic  evaluation  of  the  worrh  of  each  of  these  data  elements 
would  be  difficult  at  best. 
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Since  Varlcy's  research,  other  techniques  for  improving  data  quality  have  hecn  in¬ 
troduced.  Now  there  are  other  "locations"  where  something  can  be  done  to  control  the 
presence  of  erroneous  data.  These  locations  are  outside  the  error  generation  and  cor¬ 
rection  process;  they  are  not  people  such  as  the  data  generator  or  systems  such  as  the 
central  computer.  These  locations  can  best  be  related  to  phases  in  the  SDLC.  In  an 
article  by  Brodie,  "The  role  of  data  quality  is  [was]  placed  in  the  lifecycle  framework. 
Many  new  concepts,  tools,  and  techniques  from  both  programming  languages  and  da¬ 
tabase  management  systems  are  [were]  presented  and  related  to  data  quality."  Q980, 
p.245)  The  article  concentrated  on  the  physical  characteristics  of  the  system,  those  re¬ 
lating  to  hardware  or  software.  Brodie  discussed  such  issues  as  data  types,  structured 
programming,  data  abstraction,  DD/DS,  and  data  definition  languages.  He  then  related 
them  to  six  stages  of  software  development  which  he  defined  as: 

1.  analysis  and  definition  of  requirements, 

2.  logical  design  and  its  specification, 

3.  implementation  and  design, 

4.  implementation  construction, 

5.  validation  and  verification,  and 

6.  operation,  maintenance,  and  evolution.  (Brodie,  1980,  p.248) 

Brodie  described  where  various  tools  and  techniques  can  be  used  in  his  SDLC  frame¬ 
work,  except  for  the  maintenance  phase,  which  he  says  is  surveyed  adequately  elsewhere. 

Using  Varlcy's  research  to  understand  where  errors  are  created,  and  where  and  how 
they  can  be  corrected,  and  Brodie  s  classification  scheme  involving  the  SDLC,  a  com¬ 
prehensive  way  to  categorize  techniques  for  achieving  data  quality  can  be  developed. 
Rather  than  focusing  strictly  on  ADP  related  issues  as  Brodie  did,  this  framework  also 
looks  at  IRM  issues  as  well.  It  advocates  using  a  philosophy  similar  to  that  Varlcy 
proposed,  where  data  value  is  a  driver.  Further,  data  quality  control  and  maintenance 
issues  are  addressed  early  in  a  system's  design.  In  this  way,  internal  controls  can  be 
designed  at  the  same  time  the  system  is  crafted.  Data  values,  relative  or  otherwise,  are 
established  early,  and  then  priorities  for  data  quality  enhancement  arc  set  from  the  sys¬ 
tems  inception. 

In  Table  2  the  phase  of  the  traditional  SDLC,  an  Object-Oriented  methodology, 
and  an  Information  Engineering  development  strategy  are  related  to  new  Data  Quality 
Technique  Classifications.  To  aid  comprehension,  in  the  explanation  that  follows,  col¬ 
umn  titles  appear  in  the  same  type  face  as  they  do  in  the  table.  The  Development 


Strategy  and  its  associated  Phase  arc  listed  in  the  first  column,  with  the  SDLC  Phases 
of  Testing  and  Maintenance,  covering  all  three  strategies.  For  some,  this  in  itself  may 
represent  a  leap  of  logic,  but  Brodie's  solution  to  that  issue  works  here  as  well: 

Most  popular  database  approaches  have  only  two  stages,  e.g.|,|  infological  and 
datalogical,  which  do  not  permit  an  appropriate  separation  of  concerns,  nor  do  they 
facilitate  the  integration  of  database  with  software  engineering  technology.  How¬ 
ever,  the  development  of  database  application  is  a  large  software  development 
project.  (19S0,  p.248) 

In  the  second  and  third  columns  Varley's  Error  Detection  Locations  and  Error  Detection 
and  Correction  Procedures  are  included  to  show  how'  they  typically  address  only  the 
maintenance  phase  of  a  software  project.  New  Data  Quality  Technique  Classificat’ons 
are  provided,  and  are  related  to  the  Development  Strategy  Phases.  In  the  last  column, 
Questions  to  Ask/Issues  to  Res'olve,  those  concerns  which  should  be  addressed  at  that 
stage  of  the  life  cycle  are  listed.  These  include  some  original  questions  and  some  posed 
by  previous  researchers. 

E.  METHODS  OF  IMPROVING  DATA  QUALITY 

Using  the  framework  just  developed,  the  following  paragraphs  explore  initiatives 
that  can  improve  data  quality. 

1.  Methods  Involving  Engineering  in  Quality 

Efforts  to  manage  data  quality  need  not  be  restricted  to  the  maintenance  phase 
of  the  SDLC.  There  are  various  methods  which  can  be  used  to  ensure  data  quality,  even 
before  an  ADP  system  has  started  to  collect  the  data. 

a.  The  Data  Base 

In  a  data  base,  "The  data  records  are  physically  organized  and  stored  so  as 
to  promote  sharcability,  availability,  evolvability,  and  integrity."  (Davis,  1985,  p.5<»2) 
The  ability  to  share  data  makes  it  more  accessible  to  users,  so  it  can  be  used  more  reg¬ 
ularly.  If  the  data  is  seen  more  often,  users  will  help  to  assess  and  possibly  improve  its 
accuracy.  Increased  integrity  also  means  improved  quality. 

b.  Data  Dictionary!  Directory  Systems 

A  DD/DS  can  help  to  improve  data  quality  in  several  ways.  When  estab¬ 
lishing  a  DD/DS,  data  should  be  standardized.  To  facilitate  data  standardization, 
naming  conventions  must  be  developed.  These  conventions  help  to  avoid  redundancy 
and  more  specifically: 

The  dictionary  helps  to  enforce  agreement  on  the  definition  of  each  field  and  its  bit 
structure.  It  helps  to  avoid  having  different  fields  in  different  places  with  the  same 
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name  (homonyms)  and  the  same  field  having  different  names  in  different  places 
(synonyms).  (Martin,  1981,  p.388) 


In  addition,  if  established  properly,  the  DD/DS  can  provide  information  which  will  make 
data  maintenance  easier  once  the  system  is  deployed: 

The  DD/DS  contains  valuable  audit  trail  information  about  the  data. 
For  example,  it  could  describe  in  detail  where  and  how  the  data  is  used,  and  it 
identifies  what  program  uses  the  data,  where  it  appears  in  the  program,  what  it  is 
used  for,  what  its  relationships  are  to  other  programs,  and  whether  any  transf¬ 
ormations  were  performed  on  the  data. 

The  DD/DS  contains  information  about  the  users  of  the  data,  who  they 
are,  what  they  do  with  the  data,  how  they  use  it.  and  so  forth.  It  describes  the 
physical  devices  that  process  data,  and  documents  the  software  that  use  it.  such  as 
a  DBMS  |Data  Base  Management  System].  In  addition,  the  DD/DS  also  contains 
information  about  the  kind  of  data  that  is  used  by  the  programs,  the  users,  the 
physical  device,  the  DBMS  -  these  are  all  entities  described  in  the  DD/DS. 

All  this  information  is  important  when  tracing  incorrect  data  entry  or 
unauthorized  access  into  the  data  processing  environment.  Fvaluating  this  infor¬ 
mation  can  help  identify  the  extent  of  an  error:  and  it  may  be  possible  to  identify  the 
person  responsible  for  perpetrating  the  error,  or  illegally  accessing  the  data  base. 
These  output  facilities  can  enhance  the  users  ability  to  use  the  DD/DS  as  an  audit 
trail  aid.  (Leong-IIong.  19S2,  p.55 ) 

2.  Methods  Involving  Data  Capture 

a.  On-line  Input 

How  does  on-line  entry  improve  data  quality?  First,  recording  events  as 
they  occur  will  allow  the  data  to  reflect  the  organization's  actual  status  more  quickly, 
increasing  overall  accuracy.  Second,  error  checking  and  validation  procedures  can  be 
executed  while  the  data  arc  being  collected.  Then  errors  can  be  corrected  on  the  spot, 
by  the  individual  who  is  entering  the  event  and  is  most  knowledgeable  about  it.  On-line 
entry  can  be  followed  by  immediate  or  batch  processing,  with  different  advantages  and 
disadvantages;  see  Davis  for  a  more  complete  discussion  (19S5,  p.  139). 

b.  Distributed  Data  Processing 

Distributed  Data  Processing  (DDP)  shares  many  of  the  same  advantages 
as  on-line  entry.  Essentially,  on-line  entry  and  data  validation  are  a  low-scale  form  of 
DDP.  For  an  application  to  be  highly  distributed,  portions  of  the  master  database  must 
be  maintained  at  different  locations.  Martin  provides  more  insight  on  the  advantages 
of  DDP: 

DDP  permits  data  entry  to  be  moved  back  to  the  user  departments.  There  are  se¬ 
veral  advantages  to  this.  User  departments  can  be  made  responsible  for  their  own 
input  data,  for  the  accuracy  and  completeness  of  the  data,  and  for  the  timeliness  of 
the  entry.  Validation  can  be  done  by  the  machines  as  the  data  are  entered ;  this  is 
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desirable  because  errors  can  be  corrected  immediately,  while  the  source  documents 

are  available.  The  laborious  step  of  kev  verification  following  kev  punching  can  be 

avoided.  (1981,  p.23) 

3.  Methods  Involving  Error  Detection 

In  addition  to  the  techniques  for  error  detection  and  correction  described  by 
Varley,  audit  packages  can  aid  in  error  detection. 
a.  Audit  Packages 

Sometimes  errors  can  be  detected  by  using  software,  besides  the  applica¬ 
tions  software,  to  check  the  master  file.  The  efficacy  of  these  packages  depends  on  many 
factors,  principally  the  type  of  data  collected  by  the  application.  Numeric  data  is  gen¬ 
erally  easier  to  audit.  Neumann  surveys  the  features  of  seven  audit  packages  in  a  l.'.S. 
Department  of  Commerce  Publication  (L’.S.  Department  of  Commerce,  1977). 

4.  Methods  Involving  Error  Correction 

Varley's  framework  covered  error  correction  techniques  fairly  well.  However, 
new  technology  has  provided  for  error  suspense  files. 
a.  Rotating  Error  Files 

According  to  Benoit  the  following  benefits  are  gained  from  using  a  rotating  error  or 
suspense  file: 

•  Error  rejects  are  controlled  to  prevent  loss 

•  Error  rejects  are  corrected  by  authorized  personnel 

•  Error  rejects  are  corrected  on  time 

•  Corrections  are  subjected  to  the  same  edit,  validation,  and  update  process  as  the 

original  entries 

•  Separation  of  duties  is  maintained 

•  The  audit  trail  remains  unbroken  (1979,  p.28) 

5.  Methods  for  Assessing  Data  Quality 

Measuring  data  quality  is  an  important  area,  where  few  useful  techniques  have 
been  proposed.  If  it  is  true  that  today's  managers  value  quality  data  over  more  data, 
and  if  it  is  true  that  many  systems  do  not  measure  the  quality  of  their  data  nor  what  they 
spend  to  maintain  it,  then  what  is  the  first  step  in  changing  this  situation?  A  logical  step 
would  be  to  try  and  get  a  handle  on  the  quality  of  the  data  that  exist  in  a  data  file  or 
data  base  at  present.  If  there  are  no  quality  standards  or  measurement  techniques  for 
monitoring  data,  how  can  those  responsible  for  data  maintenance  know  where  they 
stand.  In  the  software  industry  assessment  has  recognized  benefits: 
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As  data  arc  collected  and  projects  are  measured,  an  organization  improves  its 
understanding  of  the  software  development  process  within  its  environment,  and 
therefore  the  overall  process  is  improved.  (Valett.  19S9,  p.  137) 

To  further  the  parallel  with  software  measurement  techniques,  consider  the  following. 
Productivity  is  currently  measured  based  on  lines  of  code,  and  errors  are  currently 
measured  based  on  cost  per  error.  Both  of  these  measures  are  imperfect,  containing  bi¬ 
ases  against  fourth  generation  languages  in  the  case  of  lines  of  code,  and  biases  against 
quality  products  in  the  case  of  cost  per  error  (Abdel-Hamid.  I9S9).  However,  research¬ 
ers  have  made  the  decision  that  even  an  imperfect  measure  is  better  than  none  at  all. 
This  philosophy  should  be  adopted  in  the  study  of  data  quality,  because  until  some  at¬ 
tempts  to  quantify  the  data  quality  problem  are  made,  it  can  never  be  better  understood. 

There  are  various  ways  to  measure  errors  in  software.  These  include  seeding 
models,  where: 

A  program  is  randomly  seeded  with  a  number  of  known  'calibration'  errors.  Then 
the  program  is  tested  (using  test  cases).  The  probability  of  finding./'  real  errors  of  a 
total  population  J  (an  unknown)  errors  can  be  related  to  the  probability  of  finding 
k  seeded  errors  from  all  K  errors  embedded  in  the  code.  (Pressman,  1987.  p.4t>0) 

The  way  this  is  expressed  as  an  equation  is  that  if: 

K-  Implanted  errors 
n-  Errors  detected  in  testing 
A=  Implanted  errors  detected  and 
J=  Total  errors  in  the  program 
then  Ar  can  be  found  by 

/  ~  n  x  K 

~  k 

and  j  can  be  found  by 

j  —  J  -  K 

(Abdel-Hamid,  1989) 

Mathematical  techniques  are  valuable,  because  they  provide  simple  ways  to 
quantify  something  which  otherwise  is  not  easily  measured.  While  some  mathematical 
models  for  measuring  data  quality  have  been  developed,  these  have  not  gained  as  wide 
an  acceptance  as  software  quality  measurement  techniques. 
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Haber  and  associates  used  regression  analysis  to  determine  if  various  factors 
played  a  role  in  the  amount  of  reporting  to  a  large-scale  information  system  done  by 
operational  Navy  units.  The  final  goal  was  "to  address  the  prior  problem  of  identifying 
good  vs  poor  reporters  as  a  first  step  in  approximation  to  identifying  good  vs  poor  data." 
(Haber,  1972,  p.458) 

Ballou  and  Pazer  proposed: 

A  model  which  under  certain  condition  can  trace  the  propagation  and  alteration  of 
errors  in  data  items  within  information  systems.  It  also  handles  the  impact  of  faulty 
processing  on  data  items  within  information  systems.  It  produces  expressions  for 
the  magnitudes  of  errors  in  selected  terminal  outputs.  (1985,  p.  1 5 1 ) 

However,  for  this  model  to  be  used,  all  data  must  be  numeric. 

In  his  article,  Estimating  and  Improving  the  Quality  of  Information  in  a  MIS 
[Management  Information  System],  Morey  focused  on  measuring  data  quality  of  a  MIS 
used  for  manpower  planning  in  the  L'.S.  Marine  Corps  (1982,  p.37). 

6.  Resource  Allocation  Techniques 

Another  development  worth  mentioning,  is  that  researchers  are  exploring  how 
to  better  allocate  resources  for  data  quality  enhancement.  Optimization  routines  seek 
to  provide  the  data  maintainer  with  information  about  which  data  maintenance  or  error 
correction  techniques  should  be  applied  to  which  data  set  (Ballou,  19S9). 

F.  THE  STATE  OF  THE  ART  -  DATA  MAINTENANCE 

It  appears  that  many  of  the  organizations  tasked  with  maintaining  large  stores  of 
data  do  not  use  a  rationale  such  as  that  prescribed  in  the  framework.  Data  quality 
maintenance  is  a  hit  or  miss  proposition,  where  it  is  being  done  at  all.  In  a  study  of  the 
accuracy  of  data  bases  marketed  by  vendors,  Mahmoud  and  Rice  found  that  "Almost 
20  percent  of  the  database  suppliers  responding  to  the  survey  indicated  that  they  do  not 
check  the  accuracy  of  the  data  they  receive."  (1988,  p.248)  If  this  is  true  of  organizations 
who  market  their  data  for  profit,  what  does  that  say  about  organizations  who  collect 
their  data  for  internal  use? 

Perhaps  data  maintenance  suffers  from  the  same  lack  of  attention  that  software 
maintenance  has  over  the  past.  Researchers  seem  to  agree  that: 

Software  maintenance  has  until  very  recently  been  the  neglected  phase  in  the  soft¬ 
ware  engineering  process.  The  literature  on  maintenance  contains  very  few  entries 
when  compared  to  definition  and  development  phases.  Little  research  or  production 
data  have  been  gathered  on  the  subject,  and  few  technical  approaches  or  methods' 
have  been  proposed.  (Pressman,  1987,  p.527) 
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Maintenance  related  functions  are  certainly  less  glamorous  thai.  ;e  that  are  develop¬ 
ment  related;  programmers  typically  start  out  in  maintenance  and  are  promoted  to  de¬ 
velopment.  In  addition,  organisations  may  not  realize  that  "The  principle  of  entropy 
also  applies  to  stored  data.  The  maintenance  of  data  quality  requires  continuous  inputs 
of  resources."  (Davis,  1985,  p.6ll)  Managers  have  only  recently  recognized  the  benefits 
to  be  gained  from  engineering  software  to  reduce  the  need  for  certain  types  of  mainte¬ 
nance.  Perhaps  the  same  can  be  done  to  eliminate  some  data  maintenance.  Once  an 
error  is  identified,  it  costs  more  to  correct  than  if  the  product  or  information  were  perfect 
in  the  first  place.  In  software  this  is  true,  because  correcting  errors  can  result  in  the 
generation  of  more  errors.  It  is  also  true  with  data  because  of  a  number  of  factors,  in¬ 
cluding  such  things  as  communications  overhead  and  additional  processing  time.  It  isn't 
hard  to  imagine  that  large  benefits  might  be  reaped,  if  data  quality  could  be  engineered 
and  data  maintenance  procedures  improved. 

In  the  future,  data  maintenance  issues  should  be  addressed  early  in  the  SDLC.  Too 
often  in  the  past,  these  factors  were  not  addressed  until  the  system  was  operational  and 
the  data  was  already  being  collected.  Varley  recognized  the  importance  of  comparing 
data  maintenance  costs  with  the  value  delivered,  and  more  recent  research  has  focused 
on  making  these  determinations  up  front: 

In  assessing  the  establishment  of  databases,  an  important  factor  to  be  considered  is 
the  probability  that  the  integrity  of  the  data  (accuracy,  completeness,  etc.)  can  be 
maintained.  The  assessment  should  also  consider  the  risk  from  a  degradation  of 
data  quality.  (Davis,  1985,  p.611-612) 

G.  CHAPTER  SUMMARY 

In  this  chapter  a  framework  for  classifying  all  data  quality  initiatives  was  developed. 
To  facilitate  a  better  understanding  of  the  framework,  a  working  definition  of  data 
quality  was  provided,  and  the  components  of  data  quality  were  identified.  Then,  an  early 
framework  for  classifying  data  quality  initiatives  particular  to  the  maintenance  phase  of 
the  SDLC  was  presented,  and  the  new  classification  system  was  proposed.  Using  this 
system,  developments  which  contribute  to  enhanced  data  quality  were  described.  The 
framework  developed  here  will  provide  a  reference  point  for  the  survey  of  data  quality 
controls  in  the  NES  environment,  presented  in  the  next  chapter. 
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III.  NES  DATA  QUALITY  CONTROL 

In  this  chapter,  two  deficiencies  in  the  data  quality  control  environment  of  NFS  are 
identified.  They  are  revealed  by  surveying  the  organizational  controls  which  exist,  with 
respect  to  the  Data  Quality  Initiatives  framework  established  in  Chapter  II.  These  de¬ 
ficiencies.  the  inability  to  accurately  assess  the  data  quality  of  the  EMF,  and  the  need 
for  enhanced  data  management  when  the  EMF  transitions  to  a  new  data  structure,  will 
be  further  addressed  in  Chapters  IV  and  V.  Since  the  organization  which  supports  the 
NFS  is  extensive,  and  the  survey  of  NES  data  quality  initiatives  would  not  be  clear 
without  some  background  information,  the  organizational  environment  and  software 
characteristics  of  NES  are  first  described.  To  gain  an  understanding  of  how  hard  it  is 
to  assess  the  overall  effectiveness  of  the  extended  organization  which  supports  NFS, 
difficulties  in  measuring  the  costs  of  maintaining  EMF  data  and  in  measuring  the  quality 
of  the  data  are  discussed.  The  information  presented  in  this  chapter  will  provide  a  basis 
for  understanding  the  complexity  of  data  quality  problems  in  the  MPT  support  systems 
and  the  difficulty  involved  in  eradicating  these  problems. 

A.  CHARACTERISTICS  OF  THE  ORGANIZATION 

1.  The  Total  Force  Information  Systems  Management  Department 

The  Total  Force  Information  Systems  Management  Department  (NMPC-16)  is 
responsible  for  providing  IRM  support  sen  ices  to  the  Navy's  MPT  community.  In  ad¬ 
dition  to  supporting  each  of  the  components  of  MPT,  the  department  also  has  financial 
management,  mobilization,  and  even  limited  pay  system  responsibilities.  In  I9SS.  the 
department  underwent  a  major  reorganization.  The  purposes  of  this  reorganization  were 
to  move  towards  a  future  structure  headed  by  a  Chief  Information  Officer  and  to  facili¬ 
tate  end-user  computing.  "'Data'  oriented  functions  have  been  separated  from  tech¬ 
nology'  oriented  functions."  (CIRMP,  1989,  p.1-10)  The  paragraphs  below  explain  the 
new  organization  in  relation  to  the  data  quality  function,  which  is  shared  by  a  number 
of  divisions. 

The  Customer  Support  Division  (NMPC-163)  is  responsible  for  providing  the 
MPT  community's  information  needs.  They  are  involved  in  numerous  functions,  from 
developing  small  systems  for  end-user  computing  to  assisting  with  the  preparation  of 
reports  and  the  management  of  contracts  (Director,  Reorganization,  1988).  Their 
mission  statement  also  requires  them  to  assess  customer  satisfaction. 
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The  Data  Management  Division  (NMPC-164)  is  tasked  with  "Implementjingj 
and  execute(ing]  programs  to  maintain  and  improve  the  suffiew.  <  v,  accuracy,  timeliness, 
integrity,  reliability,  accessibility,  and  security  of  MPT  data."  (Director,  Reorganization, 
1988)  Most  of  the  programs  for  data  management  are  now  handled  or  should  eventually 
be  absorbed  by  this  division. 

The  Corporate  Data  Systems  Division  (NMPC-165)  is  responsible  for  the 
maintenance  of  the  applications  programs  which  create  data  bases  such  as  the  EMF. 
Their  tasks  which  relate  to  data  quality  include  developing  programming  edits  and  en¬ 
suring  that  program  changes  do  not  adversely  affect  the  data  base. 

The  Field  Personnel  Systems  Division  (N’MPC-166)  manages  the  systems  that 
collect  the  data  at  the  input  source.  These  are  systems  such  as  the  SDS,  which  provides 
local  activities  with  the  capability  to  record  events  (such  as  activity  losses  or  gains  and 
withheld  promotions)  in  an  automated  manner,  as  they  occur.  This  division  must  ensure 
that  their  systems  provide  for  timely  and  accurate  data  input,  another  key  element  in 
data  quality  maintenance. 

Finally,  the  Technology  Support  Division  (NMPC-167)  manages  the  hardware, 
including  telecommunications  networks,  needed  to  collect  and  process  data.  They  con¬ 
tribute  to  data  quality  by  facilitating  such  things  as  timely  transmission  and  processing 
of  NFS  inputs.  This  responsibility  is  shared  with  the  Consolidated  Data  Center  (CDC) 
in  Cleveland,  Ohio,  where  NES  file  maintenance  and  reports  are  actually  processed. 

An  NMPC-16  organization  chart  appears  in  Figure  2. 

2.  The  Data  Management  Division 

While  many  parts  of  the  NMPC-16  organization  play  a  role  in  ensuring  that  the 
EMF  data  is  of  sufficient  quality,  the  responsibilities  of  assessing  and  ensuring  data 
quality  fall  primarily  on  the  Data  Management  Division  (NMPC-164).  Two  branches 
of  this  division  are  tasked  with  maintaining  data  quality,  each  in  a  different  way.  The 
Corporate  Data  Maintenance  Branch  (NM  PC- 1641)  contains  the  Enlisted  Research 
Correction  Section  (NMPC-164 IE).  This  section  is  responsible  for  correcting  trans¬ 
action  errors  from  the  NES  updates  and  for  updating  erroneous  personnel  records  in 
response  to  field  inquiries.  The  Data  Implementation  Branch  (NMPC-1642)  contains 
the  Data  Quality  Program  Section  (NMPC-1642C).  This  section  "Establishes  and  im¬ 
plements  procedures  to  measure,  maintain,  improve,  and  report  data  quality."  (Director, 
Reorganization,  1988) 

The  Data  Quality  Program  Section  acquired  a  full-time  manager  in  November 
of  1989,  and  it  is  still  in  the  process  of  building  its  staff  after  the  reorganization. 
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Upon  establishment  this  section  will  be  responsible  for  Data  Quality  at  the  imple¬ 
mentation  level  and  will  use  the  Data  Quality  Guideline  as  the  basis  for  the  prepa¬ 
ration  of  implementation  guidance  for  the  echelon  two  and  three  commands  under 
its  purview.  (CIRMP,  1989,  p.4-9) 

Managers  would  like  this  section  to  set  goals  and  priorities  according  to  need.  Devel¬ 
oping  tools  or  techniques  to  measure  data  quality  and  to  assign  priority  to  data  files  or 
elements  would  facilitate  this. 

The  Enlisted  Research  Correction  Section's  two  basic  functions  deal  with  cor¬ 
recting  specific  instances  of  erroneous  data.  Their  labor-intensive  activities  have  been 
studied  by  the  Naval  Audit  Service  and  by  Troy  Systems  in  recent  years,  in  an  attempt 
to  make  them  more  efficient.  Some  recommendations  from  these  studies  have  been  im¬ 
plemented  and  some  have  not.  Later  in  this  chapter,  these  recommendations  will  be 
summarized. 

B.  CHARACTERISTICS  OF  THE  SOFTWARE 

NOS  started  functioning  as  the  Navy's  official  information  system  on  enlisted  per¬ 
sonnel  in  July  of  1973.  It  is  a  transaction-oriented  system,  which  is  updated  in  a  batch 
processing  mode.  Software  developers  used  terminology  and  processes  common  to  fi¬ 
nancial  accounting  systems,  probably  borrowing  much  of  the  design  from  those  early 
systems.  However,  the  development  process  was  not  simple: 

Systems  functions  were  difficult  to  obtain  as  functional  managers  of  the  day  had  a 
poor  understanding  of  how  to  define  what  the  system  should  do.  To  a  large  extent 
system  developers  had  to  decide  what  the  data  requirements  for  the  data  base  would 
be  and  what  mechanisms  would  be  used  to  collect  the  data.  Furthermore,  rules  for 
editing  and  updating  were  also  constructed  by  the  ADP  developers.  (Milestone  IV 
System  Decision  Paper,  p.2) 

Obviously,  this  has  had  serious  consequences  throughout  the  life  of  the  system,  and  even 
today  affects  the  quality  of  the  stored  data. 

NES  updates  occur  daily;  special  monthly,  quarterly,  and  year-end  processing  is  also 
done.  Various  inputs  for  the  updates  are  collected  in  Washington  DC.  Then,  they  are 
bulk  data  transferred  and  processed  further.  N.MPC-16  is  billed  for  this  processing 
which  takes  place  at  the  CDC.  The  CDC  supports  much  of  the  pay  and  personnel 
community: 

Hardware  consists  of  IBM  3081  and  3084  mainframes,  IBM  3380  and  3350  disk,  and 
a  3851  mass  storage  system.  This  equipment  is  connected  through  high-speed  data 
communications  lines  to  remote  centers  at  Washington,  D.C.  and  Cleveland.  The 
complex  supports  batch  and  interactive  processing  using  IBM's  MVS/XA  operating 
system  and  Systems  Network  Architecture.  (CNO,  1988,  Appendix  9) 
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Specific  software  characteristics  include  the  following: 

•  260  modules 

•  900  programs 

•  400,000  lines  of  code 

•  Interfaces  with  25  other  systems 

•  Average  of  50,000  transactions  per  daily  update 

•  Average  of  90  requests  for  changes  or  ad  hoc  reports  are  active  at  any  time 

•  Edits  are  verification,  rather  than  exception  oriented 

•  It  uses  State  and  Country  Code  Tables,  Navy  Enlisted  Classification  Code  (NEC) 
Tables,  Rate  Tables,  Unit  Identification  Code  Tables,  Language  Code  Tables, 
Professional  Pay  Tables,  and  Submarine  Pay  Tables  to  validate  data  (Monroe,  Slide 
Show) 

C.  CHARACTERISTICS  OF  THE  DATA  AND  DATA  BASE 

The  F.MF  which  NES  creates  has  the  following  characteristics: 

•  Averages  630,000  records 

•  Record  Information: 

•  A  maximum  of  3000  bytes 

•  Average  record  length  is  990  bytes 

•  Record  data  categories  include: 

■  Personnel  Data 

•  Rate/  Rating 

■  Skill  Data  (NEC's) 

■  Service  Data 

■  Evaluations 

•  Duty  Preference 

•  Current  Activity/Duty  Assignment 

•  Availability 

■  Assignment  Data 

■  Career  History 

•  School  History 

•  Users  of  the  data  include: 
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Navy 

Department  of  Defense 


Chief  of  Naval  Operations 

Civilian  Agencies  (Monroe,  Slide  Show) 


D.  NES  EXPENSES 

1.  Cost  of  Running  NES 

It  is  important  that  a  quality  data  base  be  maintained,  so  that  the  resources  used 
in  collecting  and  processing  the  data  are  well-spent.  The  information  in  Table  3  should 
provide  some  perspective  on  the  cost  of  NES.  These  values  reflect  only  central 
processor,  tape  processing,  and  direct  access  storage  device  use.  They  do  not  include 
telecommunications,  data  collection,  or  other  costs.  They  were  provided  by  the  Navy 
Finance  Center  (NFC)  in  Cleveland,  and  are  an  output  of  the  Resource  Accounting 
System  (RAS).  If  the  costs  displayed  for  each  month  are  added  and  averaged,  an  ap¬ 
proximate  monthly  cost  can  be  obtained.  If  that  cost  is  multiplied  by  12,  the  approxi¬ 
mate  yearly  charge  of  NES  File  Maintenance  and  Reports  together,  is  over  a  million 
dollars. 


Table  3.  NES  COSTS  FOR  SEVEN  MONTHS  ( 1989) 


Month 

NES  File  Maintenance 

NES  Reports 

March 

S76.712 

S3 1,545 

April 

S89.072 

$40,328 

May 

S7 1.859 

S3 1 ,563 

June 

S71.728 

S3  3,072 

July 

S70,706 

S39.980 

August 

SS  1,472 

S32.618 

September 

S80.057 

S5 1,676 

YearIy(Avg.) 

$928,467 

$447,055 

2.  Costs  Associated  with  Data  Maintenance 

Costs  associated  with  maintaining  data  quality  include  researchers'  salaries, 
processing  costs,  micro-fiche  or  document  reproduction  costs,  and  communications 
overhead  (telephone  calls  and  meetings).  As  a  sample  of  these  costs,  the  approximate 
salaries  of  the  Enlisted  Research  Correction  Section  are  displayed  in  Table  4.  The  ci¬ 
vilian  salaries  arc  taken  from  a  January  1989  General  Schedule  (GS)  pay  chart,  assuming 
step  five  for  each  grade.  The  military  salaries  are  taken  from  the  basic  pay  chart  effective 
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January  1989  (assuming  E-8,  16  years;  E-6,  8  years;  E-5,  6  years;  E-4,  4  years;  E-3,  2 
years). 


Table  4.  ENLISTED  ERROR  RESEARCH  SALARIES 


Old  Organization 

NMPC- 1641 E 

Grade/ Rating 

Salary 

Grade/ Rating 

Salary 

GS-204-9 

S27.026 

GS-204-9 

S27.026 

GS-204-7  (2) 

S44.186 

GS-204-7 

S22.093 

GS-204-6  (2) 

S39.764 

GS-204-6 

S19.882 

GS-204-5  (11) 

S 1 96,2  8 

GS-204-5  (2) 

S35.676 

YN  PNCS 

S23.448 

YN  PNCS 

S23.448 

YN  PM  (3) 

S48.34S 

YN  PN1  (2) 

YN  PN2  (3) 

S42.336 

YN  PN2  (2) 

S2S.224 

YN  PN3  (2) 

S24.984 

YN  PN3  (1) 

$12,492 

YN  PNSN  (3) 

$30,888 

YN  PNSN  (2) 

$20,592 

Total 

$477,198 

Total 

S22 1.665 

When  attempting  to  identify  costs  involving  data  quality  maintenance  for  \ES. 
each  cost  had  to  be  gleaned  from  a  different  source.  Costs  of  processing  came  from 
NFC.  manning  information  and  transaction  data  from  NMPC,  and  pay  information 
from  Civilian  Personnel  and  Military  Disbursing  Offices.  All  this  information  was  nec¬ 
essary,  just  to  figure  out  a  rough  approximation  of  the  cost  of  researching  and  reapply¬ 
ing  one  erroneous  transaction.  This  cost  is  displayed  in  Figure  3.  Of  course,  this  figure 
does  not  include  the  overhead  costs  mentioned  before.  If  an  organization  wants  to  get 
a  better  idea  of  whether  or  not  it  is  beneficial  to  perform  a  particular  data  maintenance 
activity,  they  need  to  have  a  quicker  way  to  make  that  assessment.  The  procedure  for 
figuring  out  the  cost  of  correcting  a  transaction  required  several  months  worth  of  effort 
to  complete,  due  to  the  lack  of  readily  available  management  information.  Appendix  B 
contains  a  summary  of  the  statistics  used  to  determine  the  averages  provided  in  the  fig¬ 
ure. 

E.  USING  THE  DATA  QUALITY  INITIATIVES  FRAMEWORK 

Various  measures  are  being  taken  in  the  MPT  IRM  organization  to  improve  data 
quality.  These  efforts  span  a  continuum  from  attempting  to  engineer  in  quality  to  de¬ 
veloping  more  efficient  ways  to  correct  errors.  Using  the  framework  described  in 
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YEARLY  AVERAGES: 


NES  TOTAL  INPUT  TRANSACTIONS  7,376,778 

NM PC- 16  ERROR  RESEARCH  TRANSACTIONS  238,781 

NMPC-16  INPUT  FILE  CORRECTIONS  256,669 

NES  FILE  MAINTENANCE  PROCESSING  COST  S928.467 

ERROR  RESEARCHERS  SALARY  COSTS  5221,665 

COST  OF  PROCESSING  ONE  TRANSACTION  .13c 

COST  OF  HANDLING  ONE  MAINTENANCE  ACTION  ,45c 

COST  PER  TRANSACTION  .5Sc 


Figure  3.  Cost  of  Correcting  one  Transaction 

Chapter  II,  the  paragraphs  below  survey  past  and  current  initiatives  targeted  at  im¬ 
proving  data  quality.  From  this,  two  deficiencies  are  identified.  These  arc  the  inability 
to  accurately  assess  the  quality  of  the  data  in  the  EYIF,  and  the  need  for  enhanced  data 
quality  management  when  NES  transitions  to  a  new  system. 

1.  Efforts  Aimed  at  Engineering  in  Quality 

Often  the  quality  of  data  can  be  improved  by  better  planning  and  design.  As  in 
any  other  quality  assurance  situation,  it  is  usually  cheaper  to  do  something  right  the  first 
time. 

a.  Information  Benefit  Analysis 

Information  Benefit  Analysis  (IBA)  plays  a  role  in  improved  data  quality. 
The  purpose  of  IBA  is  to  identify  solutions  to  business  problems  or  to  explore  strategic 
opportunities,  by  studying  the  organization  and  how  it  functions.  Once  this  has  been 
accomplished,  a  data  analysis  is  done.  Data  analysis  is  a  nine  step  process  which  seeks 
to  identify  problems  or  opportunities,  find  solutions,  and  assess  the  value  of  these  sol- 
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utions.  Benefits  identified  may  be  tangible  or  intangible.  (DCXO,  1BA  Guideline,  1989, 

P-1-12) 

The  CRIMP  states  that  the  "Information  Benefit  Analysis  Methodology 
(1BA)  is  used  to  evaluate  information  needs  and  set  priorities  for  allocation  of  re¬ 
sources."  (CIRMP,  1989,  p.2-27)  The  I  BA  is  currently  used  before  the  first  milestone 
of  Life  Cycle  Management,  that  is  before  the  requirements  of  a  system  are  even  deter¬ 
mined.  Since  I  BA  forces  the  system  planner  to  quantify  information  benefits  (though 
not  necessarily  monetarily),  some  of  these  assessments  can  be  carried  over  to  the  re¬ 
quirements  analysis  phase.  Perhaps,  using  I  BA,  the  relative  value  of  data  elements  could 
then  be  determined. 

b.  Data  Issue  Resolution 

Data  Issue  Resolution  provides  for  consistency  when  data  elements  are  to 
be  added,  changed,  or  deleted  from  a  corporate  data  base.  In  NMPC-16  and  throughout 
the  MPT  community,  managers  saw  the  need  to  have  someone  to  oversee  these  issues: 

In  addressing  data  integrity  and  continuity  of  operations,  a  centralized  Information 
Systems  Change  Comptroller  is  required  to  ensure  that  data  issue  resolution  which 
results  in  chanee  is  executed  corrcctlv,  accuratelv,  and  in  a  timelv  manner.  (CIRMP, 
1989,  p.4-15) 

c.  Data  Standardization 

Data  element  standardization  has  many  benefits.  Its  primary  focus  is  to 
facilitate  better  sharing  of  resources  and  to  provide  for  more  efficient  support  of  the 
MPT  community  (DCNO,  Data  Element  Standard,  1989.  p.l).  The  MPT  IRM  Program 
Data  Element  Standard  addresses  such  issues  as  Data  Element  (DE)  Design.  DE  Defi¬ 
nition,  DE  Naming,  DE  Approval  and  DE  Registration.  A  corollary  benefit  to  data  el¬ 
ement  standardization  is  improved  data  quality.  This  is  true  for  several  reasons.  First, 
interfacing  with  other  systems  does  not  require  processing  overhead  to  convert  from  one 
format  to  another,  where  potential  mistakes  or  misunderstandings  can  occur.  Second, 
the  definitive  source  and  the  valid  values  for  a  particular  data  item  are  established,  and 
then  they  are  documented  to  prevent  confusion  and  misinterpretation. 

d.  The  Information  Resources  Encyclopedia 

Information  developed  as  a  result  of  data  element  standardization  is  entered 
in  the  Information  Resources  Encyclopedia  (IRE).  A  sample  IRE  entry  is  found  in 
Figure  4.  While  it  is  sometimes  referred  to  as  an  active  system,  the  IRE  currently  is  only 
a  data  base  of  standardized  data  elements.  In  the  future,  \MPC-16  plans  to  investigate 


32 


the  possibility  of  an  independent,  active  system.  There  arc  plans  to  enhance  the  IRE  in 
the  following  way: 

A  PC-based  [Personal  Computer),  uscr-friendlv  front-end  to  the  MPT 
and  PAY  IRE  will  be  developed  and  implemented  in  1990,  and  training  in  the  use 
of  the  front-end  will  be  provided  to  the  MPT  and  PA  Y  community  [communities). 

The  IRE  will  be  populated  with  the  IMPDB  [Integrated  Military  Per¬ 
sonnel  Data  Base  -  successor  to  the  EMF|  logical  and  physical  design  specifications; 
standardized,  corporate  IMPDB  data  elements;  and  MPT  corporate  data  architec¬ 
tures  in  1990.  .  .  .  The  IRE  will  have  an  automatic  update  capability,  with  no  manual 
data  entry  required.  This  will  ensure  that  the  corporate  metadata  repository  ,  which 
documents  the  MPT  corporate  information  system  environment,  remains  valid  and 
up-to-date.  .  .  . 

A  fully  developed,  two-way  interface  between  the  IRE  and  the  corpo¬ 
rate  DBMS  directors  will  be  in  place  in  1993.  The  interface  will  automate  the 
process  of  creating  logical  database  designs  and  will  generate  a  physical  corporate 
database  design  from  the  logical  design  information.  .  .  .  (C1RMP,  19S9.  p.4-1 1) 

If  this  automated  development  of  data  base  designs  includes  validation  routines,  it  would 
enhance  the  role  the  IRE  plays  in  ensuring  data  quality.  Then,  all  applications  would 
generate  their  validation  routines  through  the  IRE.  This  would  eliminate  a  significant 
problem,  which  is  keeping  the  edits  on  all  satellite  systems  in  synchronization  with  those 
of  the  master  system.  It  appears  that  NMPC-16  does  have  plans  to  standardize  vali¬ 
dation  routines  in  the  future,  but  it  may  not  be  done  under  the  umbrella  of  the  IRE 
(though  DD/DSs  often  have  validation  meta  data).  In  another  document,  which  de¬ 
scribes  a  three-level  architecture  for  N MFC's  future  systems,  the  IRE  is  described  as  a 
corporate  data  base: 

This  database  will  contain  information  necessary  for  the  planning,  design,  develop¬ 
ment  and  maintenance  of  information  systems.  This  information  may  be  distributed 
across  multiple  physical  databases  and  embedded  in  ADP  management  support 
systems.  (Software  Solutions,  19S8,  p.4) 

Later,  the  document  describes  another  layer  of  the  architecture,  Sy  stem  Utility  applica¬ 
tions,  which  are  "those  used  to  update  and  insure  [ensure]  the  integrity  of  the  corporate 
data."  There  are  four  of  these  systems  mentioned,  including  a  Table  Maintenance  Sys¬ 
tem,  an  I/O  (Input/Output)  Validation  System,  a  Transaction  Processing  System,  and 
a  Secondary  Database  Creation  System  (Software  Solutions,  198S,  p.4).  If  this  strategy 
is  used,  three  of  the  four  systems  named  above  will  play  a  role  in  ensuring  data  quality. 
The  Table  Maintenance  System  will  contain  any  tables  required  for  validating  input,  the 
I/O  Validation  System  will  contain  the  admissibility  edits,  and  the  Transaction  Process¬ 
ing  System  will  contain  the  relational  edits  (those  which  must  check  data  in  the  master 
data  base  to  be  performed)  (Software  Solutions,  1988,  p.4).  Whether  the  synchroniza- 


33 


PRINT  OP  PAT-MADE- 

ENCODES  IN  N1433 

til  DA 

ITEPi 

DD2DI 

HELD- AS  ALPHAS  2 

IISDt 

HELD  2  ALPHAS  1 

11491 

HELD  $  ALPHAS  4 

I2IDI 

DE3CR1PTIM 

D292D 

•THE  PAT  MAM  !N  MUCH  THE  HOME*  IS  CURRENT LT*. 

D2DSI 

•SERVIHS.* 

92199 

CATAIMUE 

D2I1D 

•ADMITS*.  'SAMPS* 

•HANCLASS'.  'ACMP'.  '70S'.  'IMS'. 

D211S 

•NES-REPMT-43(*. 

•NES-DA1LT',  •JO-SERIES'.  'NES-HONTHLT' , 

D222D 

•WWW',  'NXTftAS' 

•HID',  'DTP',  'SUMPS'.  'NMRDD', 

#212$ 

•RTSS*.  'RTSS-SUR 

'.  'RTSS-AIR* ,  'RTSS-TM' ,  'NES-REPORT'. 

#21*# 

•RESULTS',  'PEOPLE' ,  'NONET'.  'NMDAS' ,  'HES*.  'NES-DAUT'. 

•215# 

•MES-REPMT-47C, 

•  tf 04T-441 • ,  •HE5-4E704T-452* , 

#2154 

•HES-#UARTERLT' ■ 

921  St 

•••-SERIES* .  *14 

-SERIES' , 

92194 

•NES-REPMT-42I* 

02147 

•43-SERIES'.  'NES 

-REPORT-427  • 

•  Sit# 

COfWCHT 

($1(2 

‘COM  DCSCRIPTIM 

93194 

*1  ENLISTED*. 

93194 

•2  OPPICES* , 

«$!•* 

•3  MARRANT*. 

($11# 

•4  @3*. 

•  $112 

•3  NAOC  ORAM1 

•  Ill* 

•4  NL  (LEADER) 

•  1114 

•7  KS  (SUPMVISO*)', 

93114 

••  HD* . 

•  312# 

••  HR*. 

•  3122 

••  HA' 

•  31*4 

•PCRMISSIILE-VAIUES* 

93144 

•PATORAM  RANK 

PAYOdADC  RAM  • 

93159 

•  El  SR  1 

ESS  • 

•3132 

•  E2  SA  2 

LTJO  • 

•  3134 

•  E3  SN  3 

IT  • 

mr.  i  m  s 

•  E4  POS  4 

ICM  9 

K  •  trWm 

•  El  P02  3 

COR  ' 

■  '  nlBS 

•  E4  P01  4 

CAST  • 

93142 

•  C7  CTO  7 

RASM 

1«M  12 

•*  ••.)!. 1* 

HANAOER  SOFTMARE  PRODUCTS 

PAOE 

•• 

UPDATE 

STATUS  N1453 

DICTIONARY  S1PROJ 

PRINT  or  PAT-MAD e-V  ENCODED  IN  MASS 

(COST) 

•3144 

ES  3CPO  • 

RASH 

•  3144 

E*  HCPO  » 

VADM 

•  314* 

•  HI  NO 

•  317* 

•  M2  CMC  • 

•  3172 

•  HI  OR)  • 

•1174 

•  m  cm  9 

END  OP  PRINT 

Figure  4.  IRE  Entry  for  the  Data  Element  Pay-Grade 


tion  of  edits  is  achieved  by  the  IRE  or  by  the  other  system  described,  common  edits, 
which  can  be  simultaneously  updated,  should  significantly  improve  data  quality. 

2.  Efforts  Aimed  at  Data  Capture 

Conventional  wisdom  about  data  quality  indicates  that  it  is  most  beneficial  to 
validate  information  as  it  is  collected.  Editing  or  validating  at  the  input  source  elimi¬ 
nates  overhead  and  allows  the  most  knowledgeable  person  to  correct  the  data.  Of 
course,  this  was  not  always  possible  in  the  early  days  of  information  processing, 
a.  The  Source  Data  System 

In  the  last  decade,  the  SDS  was  designed  and  implemented  at  many  shore 
Personnel  Support  Detachments  in  the  continental  United  States.  This  system  contains 
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a  portion  of  the  EMF  (and  other  corporate  data  bases)  called  a  mini-master.  It  allows 
olTices  responsible  for  personnel  and  pay  accounting  to  update  the  master  data  bases  in 
a  timely  and  efficient  way.  Many  of  the  edits  or  validation  checks  performed  by  the 
master  file,  are  also  performed  at  the  local  activity,  allowing  erroneous  information  to 
be  corrected  before  it  is  recorded.  Since  there  is  some  potential  for  differences  between 
the  master  and  the  mini-master,  and  because  not  all  edits  can  be  done  in  local  software, 
it  is  still  possible  for  transactions  to  fail  to  apply  to  the  EMF  in  a  NFS  update. 

b.  Elimination  of  Optical  Character  Recognition 

At  one  time  Optical  Character  Recognition  (OCR)  was  a  new,  promising 
technology.  Apparently,  its  usefulness  in  Navy  personnel  accounting  has  been  dis¬ 
proved.  In  many  cases,  the  scanning  process  created  additional  errors.  In  fact,  overall 
"the  success  of  OCR  has  been  limited.  OCR  devices  are  still  relatively  expensive  and 
their  reliability  lower  than  other  input  devices."  (Weber,  19S2.  p. 22 1 )  Consequently, 
NMPC-16  has  done  away  with  scanning  as  much  as  possible. 

c.  Improving  Timeliness 

MPT  1R  managers  have  gotten  together  and  established  a  task  force  to  de¬ 
termine  what  represents  an  appropriate  measure  of  timeliness.  In  addition,  the  task 
force  is  seeking  to  set  standards  for  improving  timeliness: 

We  are  currently  revising  the  way  timeliness  is  measured.  Our  goal  for  100°  o  [in  a 
later  document  (Teter,  1989)  this  was  changed  to  99.5%]  submission  is  15  days.  The 
goal  includes  a  5  day  window  for  field  preparation,  10  days  for  mailing  and/or 
transmission  of  data  from  various  input  systems  (DMRS  [Diary  Message  Reporting 
System],  EPMAC  [Enlisted  Personnel  Management  Center  -  responsible  for  DMRS 
input]  SDS,  and  OCR).  The  timeliness  statistics  will  be  measured  in  calendar  days 
from  the  date  of  occurrence.  Future  date  transactions,  retroactive  starts,  changes, 
L'A(s)  [unauthorized  absences],  wavier  requests,  NRA(s)  [Navy  Recruit  Accessions] 
and  corrections  will  be  excluded.  (Commander,  Draft  of  Letter  Subject:  Timeliness 
Performance  Report) 

3.  Efforts  Aimed  at  Error  Detection 

There  are  several  locations  and  methods  used  to  detect  errors.  Those  used  in 
NES  now,  or  planned  for  future  use  are  described  in  the  sections  that  follow. 

a.  Edits  and  Reconciliation 

Errors  are  detected  in  several  different  ways.  Transaction  errors  are  caught 
by  edits  at  the  field  in  SDS,  at  EPMAC  in  DMRS,  and  during  NES  updates  for  all  input 
systems.  In  addition,  errors  are  detected  through  the  use  of  reconciliations  with  other 
data  bases  such  as  the  Social  Security  Administration's  System  or  the  Joint  Uniform 
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Military  Pay  System  (JUMPS).  These  reconciliations  are  conducted  on  a  monthly  basis. 
The  JUMPS  reconciliation: 

A.  Provides  assurance  that  data  on  the  master  record  are  identical  to 
related  data  in  the  personnel  system  for  the  same  member. 

B.  Identifies  the  aspects  of  the  update  processing  which  may  require 
modification  to  keep  the  financial  system  "in  line"  with  the  personnel  system. 

C.  Provides  a  periodic  review  of  the  validity  of  data  maintained  in  the 
personnel  system  and  forwarded  to  NAVFINCEN  [NFC).  (JDRM,  1990,  p.2-2) 

In  any  situation  where  there  is  a  no-match  condition,  a  report  is  printed,  and  error  re¬ 
search  is  conducted.  In  some  cases,  correction  transactions  arc  generated,  and  are 
processed  in  the  next  JUMPS  update.  Periodic  queries  and  file  sweeps  are  also  done  on 
an  ad  hoc  basis  to  identify  trouble  spots. 

b.  NES  Update  Statistics 

During  each  daily  update,  XES  File  Maintenance  (,F/M)  Update  Statistics 
are  produced.  These  statistics  tell  how  many  transactions  of  each  type  were  entered  in 
the  update,  how  many  failed  to  process,  and  the  error  code  each  failed  transaction  was 
assigned.  The  reports  also  contain  information  on  the  age  of  certain  records  which  are 
in  an  exception  status.  Those  familiar  with  the  statistics  can  often  identify  major  errors 
or  problems  with  an  update.  These  problems  would  include  things  such  as  improperly 
sequenced  or  garbled  tapes  being  processed. 

c.  On-line  Error  Trends 

There  has  been  discussion  regarding  the  development  of  a  data  base  of  NFS 
F/M  Update  Statistics.  Then,  analysis  of  this  data  base  could  be  conducted  to  determine 
whether  a  particular  update  produced  error  rates  within  set  tolerances.  This  would  re¬ 
duce  the  organization's  reliance  on  "experts"  who  have  been  reviewing  statistics  for 
years,  and  would  allow  large  processing  errors  to  be  caught  more  reliably. 

■4.  Efforts  Aimed  at  Error  Correction 

The  last  way  to  achieve  data  quality  is  to  actually  correct  records  identified  as 
erroneous.  This  is  the  final  check  point  for  ensuring  the  accuracy  of  data.  The  para¬ 
graphs  that  follow  provide  a  summary'  of  XMPC-16's  ongoing  initiatives  for  improving 
the  error  correction  process. 

a.  The  NES  On-line  Correction  System 

NES  now  uses  an  automated  suspense  file  to  control  rejected  input  trans¬ 
actions.  As  explained  in  the  last  chapter,  there  are  a  number  of  benefits  associated  with 
a  rotating  error  file.  The  NES  On-line  Correction  System  (NOCS)  is  an  interactive  sys¬ 
tem  which  provides  for  on-line  viewing  of  erroneous  transactions.  It  also  allows  a 
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transaction  to  be  generated  and  submitted  into  the  daily  XES  update.  The  system  is 
used  by  the  Enlisted  Research  Correction  Section  to  turn  around  transaction  errors  from 
the  daily  updates  or  to  submit  correction  transactions. 
b.  Management  Reviews 

To  improve  the  efTectiveness  of  the  Enlisted  Research  Correction  Section, 
NMPC-16  solicited  the  help  of  two  outside  organizations.  Troy  Systems  did  several  data 
quality  reviews  in  late  19S8  and  throughout  19S9,  and  in  1987,  the  Naval  Audit  Service 
provided  an  assist  visit.  The  outcome  of  these  investigations  is  briefly  discussed. 

f  ];  Troy  Systems  -  September  I9S9.  On  September  29,  19S9,  Troy  Sys¬ 
tems,  Incorporated  completed  a  Data  Quality  Improvement  Report  for  N MFC- 1642. 
The  work  performed  under  this  contract  centered  on  resolving  problems  with  two  par¬ 
ticular  data  elements,  citizenship  and  place  of  birth.  The  report  contained  recommen¬ 
dations  concerning  restructuring  edits  and  standardizing  tables  for  coding  inputs.  One 
of  the  report  findings  indicated  that  the  only  place  the  data  were  output  was  to  the 
Central  Adjudication  Facility,  responsible  for  approving  security  clearances.  NMPC-16 
has  a  Memorandum  of  Understanding  tasking  them  to  provide  support  to  this  organ¬ 
ization.  (Troy,  September,  1989,  pp.l-10)  The  work  done  under  the  contract  was  ob¬ 
viously  worthwhile,  but  the  dollars  spent  on  resolving  that  data  quality  problem  might 
have  been  better  spent  on  a  need  more  pressing  to  the  Navy's  MPT  community.  When 
compared  with  all  MPT  priorities,  improved  security  data  might  not  deliver  the  most 
value  to  the  MPT  managers.  Unfortunately,  until  data  elements  arc  assigned  a  value 
and  data  maintenance  priorities  are  compared  to  one  another,  optimal  allocation  of  data 
maintenance  resources  cannot  be  achieved. 

(2)  Troy  Systems  -  July  I9S9.  On  July  7,  1989,  Troy  Systems.  Incorpo¬ 
rated  completed  a  report  on  error  research  and  correction  analysis,  for  N'MPC-164.  (t 
contained  a  number  of  useful  recommendations  which  are  detailed  in  the  paragraphs 
that  follow. 

Their  first  recommendation  was  that  a  section  be  established  within 
NMPC-1641  to  do  data  error  analysis,  instead  of  just  error  correction  (Troy,  July  19S9, 
no  page  numbers).  At  present,  the  Data  Quality  Program  Section  is  just  getting  staffed 
up.  The  functions  that  would  be  done  by  a  data  error  analysis  section  could  also  fit  here, 
in  NMPC-1642C,  depending  on  at  what  level  of  detail  NMPC-16  managers  want  to  split 
the  data  maintenance  function. 

The  next  recommendation  was  that  the  XOCS  be  enhanced  (Troy, 
July  1989).  While  the  name  N'ES  On-line  Correction  System  seems  to  indicate  that  the 
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corrections  are  made  on-line,  they  are  actually  batched  and  processed  in  a  regular  daily 
NFS  update.  The  nature  of  the  recommendations  for  NOCS  enhancements  varied. 
Some  were  designed  to  give  the  manager  of  the  error  research  section  better  productivity 
statistics  with  which  to  measure  researchers.  Others  were  crafted  to  provide  transaction 
statistics  by  input  source,  and  still  others  were  geared  to  improving  researcher  produc¬ 
tivity  by  providing  more  tools. 

The  report  also  recommended  that  the  Enlisted  Research  Correction 
Section  "standardize  procedures  documentation"  (Trov,  July  1989).  While  the  report  did 
not  elaborate  any  further  on  this  subject,  it  appears  to  be  a  valid  recommendation.  All 
of  the  researchers  who  correct  erroneous  transactions  are  civilians,  who  would  have  little 
knowledge  of  Navy  or  ADP  terminology  when  hired. 

A  function  which  consumes  a  great  deai  of  time  in  N\1  PC- 164 1C  is 
providing  error  correction  assistance  to  field  activities  and  individuals  who  call  for  help. 
The  Troy  report  recommends  that  some  of  these  functions  be  transferred  to  NMPC-I63. 
the  Customer  Support  Division  (Troy,  July  1989). 

Currently,  error  research  for  correcting  transactions  must  be  done  by 
using  either  paper  or  microfiche  transaction  reports.  A  recommendation  to  provide  an 
on-line  transaction  file  was  included  in  the  Troy  report.  The  ofTicer  system  already  has 
this  capability,  and  the  enlisted  system  could  benefit  from  it  as  well.  Questions  such  as 
how  many  months/years  of  transactions  can  be  stored  will  need  to  be  resolved,  as  the 
volume  of  the  enlisted  system  far  exceeds  that  of  the  ofTicer  system. 

The  report  included  a  recommendation  to  move  transaction  error 
correction  to  the  source  where  the  transaction  was  created  (Troy,  July  19S9).  This  is  a 
useful  recommendation  in  many  cases,  but  only  with  a  certain  note  of  caution  which 
was  not  mentioned  in  the  report.  Well-meaning  managers  might  see  this  as  a  panacea, 
as  a  way  to  eliminate  the  transaction  correction  function  all  together.  In  fact,  cuts  to 
manning  have  already  been  based  partially  on  the  assumption  that  in  the  future,  the 
need  for  transaction  error  correction  at  headquarters  would  be  greatly  reduced.  How¬ 
ever,  as  long  as  there  are  edits,  and  there  is  potential  for  the  edits  to  differ  between  sys¬ 
tem,  transactions  will  error  out  of  the  master  file  update.  Not  all  of  these  transactions 
can  be  more  easily  corrected  at  the  input  source. 

The  report  provided  two  final  recommendations.  These  were  that  the 
personnel  system  to  pay  system  discrepancies  be  analyzed  further  and  that  all  OCR 
transaction  processing  be  eliminated.  The  department  is  actively  implementing  these 
recommendations. 
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f/i)  Naval  Audit  Service  -  19S7.  In  May  of  1987  the  Naval  Audit  Service 
completed  a  Management  Consulting  Report  for  NMPC-16.  At  tJiat  time,  before  the 
reorganization,  "NMPC-1654  was  the  branch  responsible  for  maintaining  the  officer  „nd 
enlisted  master  file  (OMF  and  EMF)  for  the  Manpower  [Personnel]  and  Training  In¬ 
formation  Systems  (MAPTIS)."  (Hickman,  1987,  p.I)  The  executive  summary  of  this 
report  provided  the  following  analysis: 

NMPC-1654  exerts  an  enormous  manual  effort  to  maintain  data  in  the  automated 
information  systems.  While  data  quality  assurance  certainly  requires  some  human 
oversight,  there  are  several  areas  within  NMPC-1654  in  desperate  need  of  auto¬ 
mation.  There  are  also  efficiencies  to  be  gained  by  reorganizing  the  sections, 
changing  work  procedures,  and  improving  the  work  environment.  (Hickman,  1987, 
p.I) 


An  issue  addressed  twice  in  this  chapter  already,  who  is  looking  out 
for  error  trends  and  how  are  they  doing  it.  appears  in  this  report  as  well.  In  19S7,  there 
was  a  section  dedicated  to  this  function.  It  was  the  Data  Systems  Analysis  Section 
(NMPC-1654C),  which  contained  one  Lieutenant,  one  other  military,  and  one  GS-12 
civilian.  "This  section  researchesfed]  and  analyzes[ed]  the  data  in  the  master  files,  look¬ 
ing  for  error  trends  and  anomolies  which  could  erode  the  validity  of  the  data  base." 
(Hickman,  1987,  p.7)  However,  Hickman  questioned  their  method  of  setting  priorities: 

NMPC-1654C  plays  an  important  role  in  actively  searching  for  problems  in  the  da¬ 
tabase;  however,  the  guidance  on  what  to  research  is  mostly  self-generated.  There 
are  no  formal  guidelines  or  priorities  for  error  detection,  and  therefore,  the  section 
appears  to  be  in  a  reactive  mode  when  problem  solving.  (Hickman.  1987,  p.  10) 

Hickman  also  says  when  error  trends  are  identified  and  programming 
changes  are  necessary  (as  frequently  they  are),  it  would  be  more  efficient  to  implement 
corrections  if  quality  assurance  and  applications  programming  personnel  reported  to  the 
same  boss  (Hickman,  1987,  p.I  1).  This  was  not  the  case  in  the  department  then,  and  is 
not  the  case  in  the  new  organization.  The  suggestion  has  merit,  for  reasons  beyond 
those  in  the  report.  Often,  isolation  of  problems  requires  programming  intervention,  this 
is  likely  another  reason  why  some  data  maintenance  functions  are  still  being  done  by  the 
Corporate  Data  Systems  Division  today. 

Lastly,  Hickman  also  recommended  that  incoming  NFS  transactions 
be  maintained  on-line  for  research  purposes  (Hickman,  1987,  p.  14).  This  is  already  done 
with  transactions  in  the  officer  system,  and  is  a  valuable  tool  for  identifying  trends  and 
resolving  processing  problems. 
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5.  Methods  for  Assessing  Data  Quality 

There  is  no  single  way  to  measure  the  quality  of  E M  F  data,  and  there  are  no 
published  standards  regarding  E.VIF  data  quality.  However,  both  of  these  issues,  meas¬ 
uring  data  quality  and  setting  quality  standards,  are  being  addressed  at  all  levels  of 
management  within  OP-16/NMPC-I6  and  the  MPT  community.  Tfforts  such  ns  estab¬ 
lishing  a  task  force  to  improve  the  way  timeliness  is  measured  and  publishing  a  policy- 
document  about  data  quality  standards,  are  steps  in  the  right  direction.  Unfortunately, 
the  problems  in  both  of  these  areas  are  complex  and  not  easily  resolved.  Data  quality 
has  a  number  of  different  components,  some  of  which  can  be  measured  more  easily  than 
others.  Until  data  quality  can  be  measured,  data  quality  standards  will  serve  no  function 
as  management  tools.  NMPC-16  regularly  measures  two  of  the  components  of  data 
quality,  completeness  with  reasonable  success,  and  timeliness  with  increasing  success. 
In  addition,  the  accuracy  of  selected  data  elements  is  measured  by  making  comparisons 
with  other  files. 

The  completeness  of  EMF  data  is  measured  monthly  and  summarized  in  the 
NFS  Element  Count  report.  This  report  is  useful  for  assessing  the  completeness  of  fields 
such  as  sex,  place  of  birth,  or  term  of  enlistment,  as  every  individual  record  should  con¬ 
tain  an  entry.  It  is  less  useful  for  fields  such  as  language  ability  or  school  completion 
date,  as  absence  of  data  may  signify  either  a  lack  of  the  qualification  on  the  sailor  s  part 
or  an  incomplete  record  in  the  file.  A  portion  of  a  sample  report  appears  in  Figure  5. 

The  timeless  of  inputs  to  the  NES  is  also  measured.  This  component  of  quality 
is  more  difficult  to  assess.  There  are  multiple  systems  putting  data  into  NES  and 
measuring  timeliness  in  a  different  way.  As  an  outgrowth  of  this  realization,  a  task  force 
on  timeliness  was  formed.  "This  issue  was  introduced  at  the  February  1988  Pay/Pcr- 
sonnel  Interface  meeting  with  NAVFINCEN  (NFC),  Code  6  taking  the  lead  .  .  .  This 
'task  force'  was  established  to  standardize  the  collection,  measurements,  and  reporting 
of  data."  (Teter,  1988,  p.l)  Information  regarding  the  timeliness  of  one  of  the  systems 
which  provides  inputs  to  NES  can  be  found  in  Table  5.  This  information  comes  from 
the  DMRS,  a  personnel  information  collection  system,  managed  by  EP.VIAC  in  New 
Orleans.  These  statistics  are  based  on  the  date  the  event  occurred  compared  to  the  date 
time  group  of  the  naval  message  in  which  the  event  was  recorded.  In  some  cases  it  is 
appropriate  that  the  event  date  be  in  the  past,  for  example  a  retroactive  entry.  This 
skews  these  statistics  slightly.  NES  is  only  a  part  of  the  Manpower,  Personnel,  and 
Training  Information  System  (MAPTIS). 
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Figure  5.  EMF  Data  Element  Count  Report 
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Table  5.  TIMELINESS  OF  DM RS  INPUTS  TO  MAPTIS 


(OCTOBER  1988  -  SEPTEMBER  1989) 


Number  of  Days 

1-5 

6-7 

8-15 

16-30 

Number  of  Transactions 

268,518 

23.239 

20,326 

43,816 

Percentage  of  Total 

76% 

■H 

6°o 

12% 

Source:  Curran,  EPMAC,  1989. 


The  accuracy  of  certain  XES  data  is  measured  in  comparison  to  the  data  in  the 
JUMPS.  This  is  not  a  measure  of  absolute  accuracy,  but  it  does  document  whether  the 
data  in  NFS  matches  that  in  JUMPS.  The  NES/JUMPS  disparity  standard  is  one-half 
of  one  percent,  with  a  long  range  goal  of  zero  percent  (DCNO  MOU,  19S9,  Tab  F).  In 
the  past,  file  disparity  statistics  were  produced  and  standards  were  issued  for  each  data 
element.  This  is  no  longer  being  done,  as  of  the  1989  Memorandum  of  Understanding 
(MOU).  Since  the  latest  disparity  statistics  produced  were  for  19S4-19S5,  they  are  not 
current  assessments  of  EMF  quality. 

There  is  an  organization  in  the  Chief  of  Naval  Operation's  staff  (OP-16)  which 
provides  policy  guidance  to  the  MPT  fRM  community.  The  branch  responsible  for 
Data  Resource  Management  has  developed  a  MPT  IRM  Data  Quality  Guideline.  "This 
guideline  is  the  first  step  in  promoting  a  data  quality  program  for  MPT  corporate  data." 
(DCNO,  Data  Quality  Guideline,  1988,  p.2)  As  a  policy  document,  it  must  apply  to  all 
MPT  systems,  and  can  contain  only  general  information  about  data  quality.  Therefore, 
it  does  not  attempt  to  set  quality  standards.  The  NMPC-16  organization,  described  in 
detail  earlier,  is  tasked  \  ith  implementing  this  policy. 

While  data  quality  is  and  always  has  been  a  concern  in  the  MPT  IRM  organ¬ 
ization,  user  perception  of  the  quality  of  the  data  bases  has  not  always  been  positive. 
NMPC-16  cannot  afford  to  consider  this  situation  a  user  problem.  If  managers  fail  to 
use  vital  data,  because  they  perceive  it  as  unreliable,  strategic  opportunities  may  be 
missed.  The  Navy's  IR  Program  is  founded  on  the  recognition  that  information  plays 
just  such  a  strategic  role  in  managing  Navy  business. 

As  discussed  above,  user  perceptions  must  be  dealt  with.  Especially  now,  when 
user  assessments  are  an  important  and  accepted  measure  of  data  quality: 


The  three  methods  generally  used  to  examine  the  data  quality  in  large  files  are  sur¬ 
veys  of  end  users  or  clients,  samples  of  entire  record  files,  and  samples  of  active  or 


current  cases.  Surveys  of  end  users  typically  measure  "perceptions"  of  data  quality 
and  are  fraught  with  problems  of  recall,  self-report  bias,  and  serious  underestimates. 
(Laudon,  1986,  p.6) 

However,  in  absence  of  a  better  method  surveys  may  be  useful.  The  bottom  line  is  that 
WIPC-l 6  could  use  another  organizational  technique  for  assessing  data  quality.  The 
picture  regarding  data  quality  is  not  as  clear  as  the  department  would  like  it  to  be.  Such 
oblique  statements  as  "The  accuracy  of  the  input  data  had  varied  over  time,"  point  this 
out.  (Milestone  IV  System  Decision  Paper,  p.8) 

6.  Resource  Allocation  Techniques 

Currently,  there  are  no  specific  techniques  used  to  allocate  resources  towards 
improved  data  quality.  Decisions  are  based  on  queries  from  higher  echelons,  requests 
from  MPT  managers,  and  the  gut-feelings  of  the  applications  programmers.  This  is  an 
area  where  operations  research,  properly  applied,  could  enhance  managers'  decisions. 

F.  THE  EMF  -  TRANSITION  TO  A  DATABASE  SYSTEM 

In  the  1990s,  \MPC-16  intends  to  transition  the  Officer,  Enlisted.  Reserve,  and  In¬ 
active  master  files,  to  the  Integrated  Military  Personnel  Data  Base  (IMPDB).  This 
transition  affords  IR  managers  and  systems  designers  an  opportunity  to  use  not  just  a 
data  driven  strategy,  but  a  quality  data  strategy,  to  develop  the  data  base  of  the  future: 

The  goals  of  the  IMPDB  are:  to  provide  a  cradle-to-grave  view  of  each  Navy  mem¬ 
ber's  career;  to  reflect  the  official  service  record:[;J  to  provide  a  single,  authoritative 
source  for  corporate  data  about  Navy  members;  to  be  consistent  for  all  functions 
using  the  data  (standardized);  to  be  organized  in  the  most  appropriate  way  to  serve 
the  needs  of  users  of  the  data;  and  to  be  valid  and  available  to  users.  (Hill,  19S8, 
p.l) 

NMPC-16  has  completed  the  first  and  second  iterations  of  designing  the  logical  data 
model.  The  General  Functional  Requirements  (GFR)  first  published  in  June  of  I9S7, 
were  revised  by  Tidewater  Consultants  in  August  of  1989.  Many  areas  of  IMPDB  re¬ 
quirements  are  covered  thoroughly  in  the  GFR.  However,  the  information  regarding 
howr  improved  data  quality  will  be  achieved  in  IMPDB  is  sparse.  The  GFR  section  on 
IMPDB  objectives  states  that  "policies  and  procedures  need  to  be  established  and  im¬ 
plemented  to  assure  that  data  quality  is  maintained,"  but  the  only  reference  to  data 
quality  stated  that  "Stored  and  transmission  data  error  rates  should  not  exceed  industry 
standards."  (Tidewater,  1989,  pp.3-1  and  3-4)  The  Software  Architecture  Level  I  Docu¬ 
ment  also  contains  some  references  to  a  Management  Procedures  and  Standards  System 
which  "will  'house'  the  rules  on  procedures  and  standards  with  which  the  MPT  commu- 
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nity  must  comply."  (Software  Solutions,  198S,  p.6)  These  include  such  things  as  "The 
overall  error  rate  occurring  on  pay  related  transactions  will  be  less  than  5%,"  and 
95%  '  of  pay  related  errors  will  be  corrected  in  10  days."  (Software  Solutions,  1988.  p.6) 
There  are  also  references  to  a  Process  Monitoring  and  Control  System  which  "will 
monitor  the  environment,  and  predefined  processes  iu  perform  statistical  analysis  in 
functional  areas."  (Software  Solutions,  1988.  p.6)  However,  the  document  itself  gives 
only  sample  standards  and  processes,  and  it  later  asks  "How  will  the  policies  and  pro¬ 
cedures  established  by  these  systems  be  enforced?"  (Software  Solutions.  1988,  p.6)  The 
issues  of  how  to  improve  data  quality  and  how  to  measure  data  quality  in  IMPDB  need 
to  be  addressed  in  more  detail  now.  Management  control  systems  to  support  this  goal 
need  to  be  defined  for  future  development.  Chapter  V  will  explore  how  NMPC-16  could 
use  improved  technology  to  start  designing  some  of  these  controls. 

G.  DATA  QUALITY  AND  THE  ROLE  OF  THE  USERS 

Up  to  this  point,  \.MPC-I6's  role  in  maintaining  the  quality  of  the  MP  T  data  has 
been  discussed  at  length.  So  as  not  to  leave  out  an  important  aspect  of  data  quality 
control,  the  following  paragraphs  survey  how  the  users  help  maintain  data. 

1.  Data  which  Impacts  Pay 

Some  of  the  information  contained  in  the  personnel  file,  affects  the  pay  of  the 
military  member.  Both  the  member  and  the  servicing  pas  office  have  a  vested  interest 
in  ensuring  that  this  data  which  impacts  pay  is  accurate.  The  Leave  and  Earnings 
Statement  (ITS)  is  produced  monthly,  it  displays  this  information  (though  from  a  dif¬ 
ferent  source  -  JUMPS)  to  the  member  and  the  payroll  clerk.  Since  these  data  hurt  the 
sailor  in  the  wallet,  they  are  probably  the  data  users  spend  the  most  time  and  energy 
trying  to  maintain. 

2.  Personnel  Data 

Every  month,  the  Enlisted  Personnel  Management  Activity  produces  the  En¬ 
listed  Distribution  Verification  Report  (EDVR).  The  EDVR  instruction  contains  the 
following  information,  which  encourages  personnel  offices  to  keep  data  up-to-date: 

5.  Accuracy  of  the  EDVR  -  Manning  and  assignment  decisions  are  based  upon  in¬ 
formation  contained  in  the  EDVR.  It  is  extremely  important  that  each  activity  keep 
its  account  up-to-date  and  accurate  by  reporting  personnel  events  as  they  occur  and 
correcting  errors  when  identified.  (NAVMILPERSCOM  Instruction  1080. 1 D,  19S9, 

p.2) 

The  Distribution  Support  Division  (NMPC-47)  has  many  elements  of  the  EMI- 
displayed  to  detailers  in  the  Personnel  Information  Module  of  the  Navy  Military  Pcr- 
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sonncl  Distribution  System.  Often,  an  informal  validation  of  these  data  is  conducted 
when  a  detailer  is  negotiating  orders  with  a  constituent. 

3,  Strategic  Data 

Data  from  NES  are  aggregated  and  used  for  such  functions  as  force  planning, 
promotion  planning,  and  accession  planning  by  the  Chief  of  N'avaL  Operations  Staff 
(OPN'AV)  for  MPT.  These  strategic  data  can  be  compared  to  previous  data  to  identify 
trends.  Sometimes  the  OPNAV  staff  can  identify  error  trends  through  analysis  of  ag¬ 
gregated  data,  better  than  other  users,  who  look  at  the  data  an  individual  at  a  time. 

H.  CHAPTER  SUMMARY 

XMPC-16  is  a  mature  ADP  organization,  with  a  long  history  of  managing  large  and 
complex  information  systems.  Their  efforts  to  achieve  improved  data  quality  have  been 
broad  and  effective,  however,  the  quality  of  EMF  data  cannot  be  reasonably  assessed. 
It  would  benefit  the  users  and  maintainers  of  the  EMF  if  data  quality  coulc  be  meas¬ 
ured.  Users  would  have  more  confidence  in  the  data  and  could  set  different  maintenance 
priorities  for  various  data  elements.  Data  maintenance  resources  could  then  be  allocated 
more  effectively.  This  would  also  allow  the  maintainers  to  develop  a  comprehensive  plan 
for  data  quality  control.  As  the  EMF  transitions  to  a  data  base,  new  opportunities  for 
improvement  present  themselves.  Many  of  the  questions  presented  in  the  Data  Quality- 
Initiatives  framework  presented  in  Chapter  II,  can  be  addressed  as  the  data  base  is  de¬ 
signed.  At  this  point  data  value  can  be  established  and  drive  future  data  quality  en¬ 
hancement  priorities.  Chapter  IV  will  propose  a  method  for  measuring  the  quality  of 
EMF  data,  and  Chapter  V  will  propose  ways  to  engineer  data  quality  controls  into  the 
new  data  base  and  recommend  new  technologies  which  can  be  brought  to  bear  on  the 
data  quality  problem. 


IV.  ME  AS  l  MG  DATA  QUALITY  IN  THE  EMF 

In  this  chapter,  a  technique  for  assessing  data  quality  in  the  EMF  will  be  presented. 
The  technique  is  then  tested  on  a  small  scale.  This  analysis  will  serve  as  a  baseline  for 
the  Data  Management  Division,  for  the  Enlisted  Research  Correction  Section,  or  for 
other  thesis  students  who  may  desire  to  test  this  technique  more  fully  or  devise  others 
which  are  more  appropriate  to  the  NES  application. 

A.  A  WAY  TO  MEASURE  DATA  QUALITY 

NMPC-16  needs  a  better  way  to  measure  the  quality  of  data  in  the  EMF-'.  Haber 
and  associates  used  linear  regression  to  determine  whether  good  or  poor  data  quality 
could  be  related  to  good  or  poor  input  sources  (in  this  case  good  versus  poor  reporting 
was  assessed  based  on  quantities  of  input).  This  technique  vas  not  considered  to  be 
appropriate  for  application  to  the  NES  environment,  due  to  the  fact  that  reporting  vol¬ 
umes  can  be  correlated  with  many  factors  besides  accurate  reporting.  The  researchers' 
assumption  that  reporting  volumes  can  be  correlated  with  the  quality  of  the  data  being 
input,  while  appropriate  for  a  maintenance  system,  did  not  seem  reasonable  in  the  case 
of  NES.  Ballou  and  Pazer  proposed  a  model  which  could  produce  an  expression  for  er¬ 
ror  magnitudes  in  output,  but  it  was  only  appropriate  for  applications  where  all  data  is 
numeric. 

In  1982,  Morey  published  a  paper  in  which  he  derived  several  equations  to  estimate 
the  stored  error  rate  in  a  Management  Information  System  (MIS).  In  deriving  these 
equations,  he  recognized  the  importance  of  the  feedback  systems  concept  and  the  dis¬ 
position  of  rejected  transactions,  on  the  overall  stored  MIS  error  rate.  He  tested  his 
equations  on  the  leave  transaction  of  the  Marine  Corps'  system  for  manpower  manage¬ 
ment.  In  describing  the  type  of  system  upon  which  his  technique  could  be  used,  he  said: 
"The  MIS  addressed  is  one  where  records  in  a  MIS  are  updated  as  changes  occur  to  the 
record,  e.g.,  a  manpower  planning  MIS  where  changes  may  relate  to  a  service  man's 
rank  or  skills.”  (Morey,  1982,  p.337)  Since  this  describes  the  NES  rather  well,  it  ap¬ 
peared  feasible  that  Morey's  technique  would  allow  NMPC-16  to  quantify  the  overall 
quality  of  the  EMF.  The  estimation  technique  is  based  on  the  relationships  between 
three  measures  of  data  quality:  the  transaction  error  rate,  the  intrinsic  transaction  error 
rate,  and  the  stored  MIS  error  rate  (Morey,  1982,  pp.337-338).  To  describe  the  proba- 
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bilitics  of  various  dispositions  of  new  transactions,  he  used  a  decision  tree.  This  decision 
tree  is  presented  in  Figure  6. 


1  la  Ota  cocun^cocy.  i ht  frwaiow  of  the  taler  lacuoa  uitervaJ  (0.  T)  which  the  record  ts  tacorract.  a  the  analler  of  C>/  T  tad  I 
'  la  (ha  case,  the  fracuoo  of  <0.  7)  over  which  the  record  a  ia  error  a  the  wutUer  of  (Ci  ♦  C»)/T  aad  I 
1  la  tha  cue  the  fractioa  of  (0.  T)  over  which  the  record  a  ta  error  a  the  taelkr  of  (C.  ♦  C»)/T aad  1. 

*  la  thu  case  the  record  a  ta  error  for  the  cam*  tatcrval  (0.  Ty 


Figure  6.  Morey's  Decision  Tree 

Since  it  was  unrealistic  to  gather  parametric  data  for  all  types  of  transactions  proc¬ 
essed  by  \ES,  only  six,  from  over  130  possible,  were  identified  for  data  collection.  A 
listing  of  all  the  transactions  and  their  purpose  is  provided  in  Appendix  C.  It  is  impor¬ 
tant  to  note  that  transactions  vary  greatly  in  their  input  format,  in  their  data  element 
contents,  and  in  the  associated  edit  routines  through  which  they  are  processed. 2 

The  transactions  were  chosen  based  on  several  criteria.  The  first  criterion  was  that 
failed  transactions  must  be  researched,  in  large  part,  by  an  NMPC  office.  There  are 
some  failed  transactions  which  are  always  returned  to  the  input  source  for  correction, 
and  there  would  be  no  simple  way  to  gather  statistics  on  the  disposition  of  these.  The 
second  criterion  was  that  the  transaction  volume  and  transaction  error  volume  must  be 
great  enough  that  a  sample  could  be  obtained  within  a  reasonable  time.  The  last  crite¬ 
rion  was  that  there  not  be  known  programmatic  problems  which  could  create  false  er¬ 
rors.  Transactions  were  narrowed  down  to  the  following:  A68,  C21,  E38,  GIB,  QC1, 
1FL,  1SR,  200,  300,  328,  340,  355,  382,  630,  and  798,  by  scanning  a  N-1652  Monthly 
Transaction  Totals,  Overall  Report  dated  7  April  1989.  A  page  from  this  report  appears 

2  A  similarity,  every  transaction  starts  with  a  three  character  alpha-numeric  code  to  identify 
its  type,  followed  by  the  social  security  number  and  five  characters  of  the  last  name,  to  identify  the 
enlisted  member. 
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in  Figure  7.  Those  transactions  listed  are  the  ones  with  over  100  errors  indicated  for 
N-1652  research  (the  report  still  says  N-1652,  as  the  codes  have  not  been  updated  since 
the  department  reorganized).  To  decide  which  of  these  transaction  to  collect  data  on, 
discussions  with  various  NMPC  staff  and  contractor  personnel  were  conducted.  In  ad¬ 
dition,  a  transaction  not  handled  by  NMPC-16's  researchers  was  selected.  Information 
about  the  selected  transactions  appears  in  Table  6. 


Table  6.  TRANSACTIONS  FOR  DATA  COLLECTION 


TAC 

TAC  TITLE 

PURPOSE  OR  DESCRIPTION 

A68 

Prospective  Rate 
Abbreviation 

Update  or  correct  prospective  rate. 

QCO 

Availability 

Inform  detailers  that  members  are  available  for  im¬ 
mediate  assignment. 

300 

Discharge  - 
Immediate 
Reenlistment 

Process  those  members  who  have  reenlisted  within 

24  hours  after  discharge. 

301 

Name  Change 

Change  or  correct  a  member's  name  to  agree  with 
official  documents. 

328 

Present  Rate 
Abbreviation 

Update  present  rate. 

340 

Court 

Memorandum 

Forward  to  NFC  all  guilty  courts  martial  Endings, 
all  NJP's  which  affect  pay  and  rate,  administrative 
actions  or  restoration  of  above. 

Source:  Active  Duty  Enlisted  Data  Elements  Catalog,  Appendix  A 


1.  The  Parameters  in  Morey's  Formula 

An  analysis  of  how  each  parameter  of  Morey's  estimation  formula  would  be 
determined  was  conducted.  A  detailed  description  of  this  analysis  is  provided  in  the 
paragraphs  that  follow. 

First,  Morey  described  the  transaction  reject  rate:  "The  transaction  reject  rate 
[r  is],  i.e.,  the  proportion  of  incoming  transactions  which  fail,  either  correctly  or  incor¬ 
rectly,  the  various  edits  and  logical  tests  used.''  (1982,  p.338)  To  determine  this  param¬ 
eter  the  N-1652  MAPMIS  Monthly  Transaction  Totals,  Overall  Report  was  used.  19 
months  of  data  were  averaged  to  determine  an  overall  estimate  of  r.  These  statistics 
were  taken  from  1989  and  1987  reports,  mainly  because  that  is  what  the  organization 
had  available  for  release.  The  reason  only  19  months  of  data  were  used  is  that  the  re¬ 
ports  for  June,  November,  and  December  1987,  and  January  and  May  of  1989  were 
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Figure  7.  MAPMIS  Monthly  Transaction  Totals 


missing.  Lotus  123  Release  2.01  was  used  to  tally  and  compute  the  averages.  Averages 
for  two  years  worth  of  data  were  then  averaged  together  for  an  overall  figure.  A  sum- 
man.  of  these  transaction  reject  rates  appears  in  Table  7,  and  listings  of  the  Lotus  123 
spreadsheets  appear  in  Appendix  D. 


Table  7.  ERROR  RATES  FOR  SELECTED  TRANSACTIONS 


TRANSACTION 

A68 

QCO 

300 

301 

328 

340 

1987 

3.10% 

12.41% 

4.64% 

5.15% 

13.61% 

9.41% 

1989 

1.20% 

8.26% 

9.50% 

mm a 

■am 

7.50"  o 

AVERAGE 

2.15% 

10.34% 

7.07% 

5.24% 

17.97% 

S.46% 

"P  denotes  the  conditional  probability  that  an  erroneous  transaction  is  properly 
rejected  by  one  of  the  edits.  Hence  1-  P  is  the  probability  of  the  Type  I  error  occurring.  " 
(Morey.  19S2,  p.339)  A  Type  I  error  is  defined  as  an  erroneous  transaction  that  slips 
through  the  edits.  While  Morey  does  not  describe  how  this  can  happen,  it  might  be 
because  there  arc  not  enough  edits,  the  edits  are  not  stringent  enough,  or  it  is  impossible 
for  edits  to  determine  that  the  transaction  is  in  error.  P  and  1  -P  can  be  estimated  by 
determining  the  actual  status  of  the  transactions  that  the  Enlisted  Research  Correction 
Section  turns  around  for  five  of  the  transactions.  Tor  the  QCO  (Availability)  trans¬ 
action.  the  Enlisted  Availability  Control  Branch  recorded  the  results  of  their  research 
efforts.  Selected  members  of  these  sections  tallied  the  status  of  the  rejected  transactions 
they  handled  on  the  worksheet  at  Appendix  E.  Lotus  123  spreadsheet  summaries  of  this 
data  appear  in  Appendix  F. 

“P>  denotes  the  conditional  probability  that  a  correct  transaction  is  improperly 
rejected  by  one  of  the  edits,  thereby  delaying  proper  updating  of  the  record.  P'  is  the 
probability  of  the  Type  11  error  occurring."  (Morey,  1982.  p.339)  A  Type  11  error  means 
that  a  correct  transaction  is  improperly  rejected,  and  1  -P'  means  that  the  correct  trans¬ 
action  was  correctly  processed.  The  probabilities  can  be  measured  in  the  same  manner 
as  those  above,  and  the  summary  data  for  this  parameter  also  appears  in  Appendix  F. 
In  Table  8,  the  estimated  values  for  P  and  P'  arc  displayed. 

"T  denotes  the  nonnegativc  random  variable  representing  the  time  interval  or 
spacing  between  transactions  for  a  given  record  of  the  type  being  analyzed;  these  arc 
further  assumed  to  be  independent  and  identically  distributed.  Candidates  for  T  might 
be  the  exponential,  uniform,  lognormal  random  variables  or  even  a  constant.  Also  let 
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Table  8.  VALUES  FOR  PROBABILITIES:  *  means  that  no  data  was  available 
for  this  transaction. 


TRANSACTION 

A68 

QCO 

300 

301 

328 

340 

P 

* 

.8772 

* 

.7500 

.5000 

.9583 

F 

* 

.1228 

* 

.2500 

.5000 

.0417 

ftr  denote  the  mean  of  the  intertransaction  times."  (Morey,  1982,  p.339)  Unfortunately, 
there  is  no  accurate  way  of  measuring  how  often  a  particular  transaction  of  a  particular 
type  is  applied  to  a  particular  record  in  the  EMF.  For  example,  a  QDO,  or  a  set  of  or¬ 
ders,  will  be  generated  on  intervals  of  the  members'  tour  lengths.  This  could  be  every 
two  to  five  years.  A  1FL,  a  gain  to  active  enlisted  strength,  occurs  only  once  in  the  life 
of  a  particular  record.  Ideally,  if  a  transaction  file  were  maintained  on-line,  it  could've 
been  queried  to  get  this  information.  Since  a  file  like  this  is  maintained  for  the  Officer 
Master  File  (OMF),  it  was  reasonable  to  assume  that  there  might  be  something  similar 
for  the  EMF.  However,  that  was  not  the  case.  In  retrospect,  even  if  there  were  a 
transaction  file,  it  would  probably  not  go  back  enough  years  to  do  this  t\pe  of  query, 
due  to  the  large  intertransaction  times  for  many  transactions  and  to  the  sheer  volume 
of  enlisted  transactions  which  are  processed.  The  intertransaction  times  for  these 
transactions  are  much  greater  than  those  for  the  transactions  Morey  studied,  however, 
he  did  not  propose  any  limits  on  these  times.  Since  this  parameter  could  not  be  meas¬ 
ured,  it  was  estimated.  These  estimates  appear  in  Table  9. 


Table  9.  INTERTRANSACTION  TIMES  FOR  SELECTED  TRANSACTIONS 


TRANSACTION 

A68 

QCO 

300 

301 

328 

340 

DAYS 

1460 

1080 

1460 

5475 

1460 

9000 

"C,  denotes  the  minimum  processing  time,  measuring  the  elapsed  time  from 
when  a  transaction  is  submitted  to  the  system  until  it  updates  the  record.  This  occurs 
for  a  transaction  not  rejected  by  any  of  the  edits."  (Morey,  1982,  p.339)  This  would  be 
from  one  to  three  days  in  most  cases.  In  the  SDS,  transactions  are  normally  up  loaded 
daily.  DMRS  transactions  could  take  longer  if  the  message  system  is  backlogged  with 
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higher  priority  communications  or  if  Minimize  3  is  imposed,  it  could  also  take  longer 
if  a  problem  with  format  causes  the  transaction  be  rejected  for  research  at  EPMAC, 
where  field  inputs  are  processed  for  transmission  to  NES.  DMRS  timeliness  information 
appears  in  Chapter  III,  however  these  statistics  are  based  on  the  transaction's  date  of 
occurrence.  Therefore,  they  take  into  account  reporting  delays  by  the  activity  responsi¬ 
ble  for  personnel  accounting  and  retroactive  transactions  which  constitute  a  correction. 
In  the  future,  timeliness  will  be  measured  for  each  increment  of  processing,  and  a  mean 
should  be  readily  available.  For  the  purposes  of  this  study  1.5  days  was  used. 

"C2,  a  constant,  denotes  the  additional  processing  time  delay,  over  and  above 
C„  to  manually  review  and  correct  transactions  which  (i)  were  in  error,  and  (ii)  were 
properly  rejected  by  the  edits."  (Morey,  1982,  p. 339)  This  was  measured  by  recording 
the  number  of  days  which  elapsed  between  when  the  transaction  was  first  rejected  and 
when  it  was  either  corrected  or  deleted  from  the  error  suspense  file.  The  figures  derived 
from  the  data  appear  in  Table  10.  See  Appendix  F  for  the  raw  data. 


Table  10.  TIME  FOR  TRANSACTIONS  TO  BE  CORRECTED/DELETED 


TRANSACTION 

A68 

QCO 

300 

301 

328 

340 

DAYS 

2.0 

1.2 

17.8 

3.0 

28.0 

10.8 

"C,.  a  constant,  denotes  the  additional  processing  time  over  and  above  C,  ,  to 
manually  review  and  allow  to  enter  into  the  system  any  intrinsically  correct  transactions 
which  were  rejected  by  the  edits.  It  is  assumed  that  the  reviewer  is  able  to  ascertain  the 
correct  situation  so  that  the  stored  record  is  updated  accurately."  (Morey,  1982,  p.339) 
This  information  was  obtained  from  the  Enlisted  Research  Correction  Section.  The 
figures  derived  from  the  data  appear  in  Table  1 1.  These  data  appear  in  Appendix  F. 


Table  11.  TIME  FOR  TRANSACTIONS  TO  BE  REINPUT:  *  means  that  no 
data  was  available  for  this  transaction. 


TRANSACTION 

A68 

QCO 

300 

301 

328 

340 

DAYS 

* 

1.9 

* 

9.0 

1.7 

2.0 

3  Minimize  is  when  message  traffic  to  a  particular  geographic  area  is  suspended  unless  it  is 
operationally  oriented. 
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2.  Determining  the  Stored  MIS  Error  Rate 

Using  the  parameters  described  in  the  previous  section,  an  estimate  for  er,  "The 
intrinsic  transaction  error  rate,  i.e.,  the  proportion  of  the  incoming  transactions  that  arc 
truly  in  error,"  can  be  obtained.  (Morey,  1982,  p.338)  The  equation  that  applies  is: 


eT- 


0  if  r  <  P' 
if  P'<r<P 
1  if  r>  P 


(I) 


However,  if  P  <  P*  then  the  formula  does  not  apply,  and  the  edit  should  be  eliminated, 
because  it  is  causing  the  rejection  of  more  correct  transactions  than  erroneous  ones 
(Morey,  1982,  p.340).  A  complete  explanation  of  the  derivation  of  this  formula  is  con¬ 
tained  in  the  appendix  of  Morey's  paper  (1982,  p.3-42). 

Next  Morey  defined  the  stored  MIS  error  record  rate,  e>,  as  "the  probability  that 
the  stored  record  is  in  error  for  any  reason.  It  is  defined  as  the  likelihood  that  a  ran¬ 
domly  chosen  record  (for  the  particular  record  type  of  interest)  examined  at  a  random 
point  in  time,  is  in  error.  It  includes  the  situation  where  a  change  in  the  record  has  oc¬ 
curred  but  has  not  been  updated  in  the  record."  (Morey,  19S2.  p.33S)  Notice  that  this 
definition  takes  into  account  the  concept  of  data  volatility,  mentioned  in  Chapter  II. 
Using  the  equation  below,  a  lower  bound  for  the  stored  MIS  record  error  rate  can  be 
found.  This  could  allow  NMPC-16  managers  to  assess  the  relative  accuracy  of  the  data 
applied  by  each  transaction.  It  also  could  help  the  Data  Management  Division  to  target 
available  resources  to  enhance  data  quality  more  effectively.  Morey  says: 

K,  >  *t<1  ~  n  +  IQ(1  -  er){l  -  P')  +  (Cj  +  C2)eTP  +  (C,  +  C3)(l  -  er)P']lnT  (2) 


Since  the  motivation  for  discussing  Morey's  paper  was  to  introduce  a  technique 
for  measuring  the  overall  stored  MIS  error  rate  of  the  EMF,  this  paragraph  explains 
how  that  could  possibly  be  achieved.  As  stated  before,  equation  (2)  gives  the  lower 
bound  for  the  stored  MIS  error  rate  of  data  elements  associated  with  a  particular  type 
of  record,  or  in  this  case  a  particular  type  of  transaction.  In  determining  the  overall 
stored  MIS  error  rate  of  the  EMF,  this  process  would  have  to  be  repeated  for  each  of 
the  130  plus  transactions,  and  then  that  figure  would  have  to  be  weighted  by  the  data 
elements  contained.  Remember,  each  transaction  contains  different  data  elements  and 
a  different  number  of  data  elements.  However,  a  particular  data  element  may  be  con¬ 
tained  in  more  than  one  transaction.  That  is  to  say  that  in  regards  to  data  elements  the 
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transactions  arc  not  mutually  exclusive,  but  are  collectively  exhaustive.  Perhaps,  further 
research  will  point  out  that  certain  transaction  volumes  are  low  enough  as  to  not  sig¬ 
nificantly  affect  the  overall  error  rate  of  the  EMF,  and  then  they  could  be  excluded  to 
reduce  the  effort  required  to  obtain  the  HMF's  error  rate. 

3.  The  Results  of  the  Analysis 

All  the  values  were  entered  into  the  equations  to  determine  if  an  acceptable  es¬ 
timate  of  the  stored  MIS  error  rate  could  be  found.  The  computed  values  for  <?v,  varied 
from  .02%  to  .19%,  extremely  low  error  rates.  Complete  results  appear  in  Appendix 
G.  In  two  cases,  for  the  QCO  and  301  transactions,  r  <  F,  therefore  equation  (1)  did 
not  apply  and  the  intrinsic  transaction  error  rate,  eT  ,  was  considered  to  be  zero,  rather 
than  a  computed  value  which  would  have  been  negative.  For  the  32S  transaction,  P  < 
F.  In  a  case  like  this,  Morey  says  that  the  intrinsic  transaction  error  rate  can  not  be 
estimated  by  his  equation  and  that  the  edits  should  be  dropped,  because  they  are  not 
performing  a  useful  function.  In  addition,  for  the  A68  and  300  transactions,  data  to 
determine  one  of  the  constants,  C3  ,  was  not  available.  Therefore,  the  value  for  C3  was 
assumed  to  be  zero.  This  leaves  only  one  transaction,  340,  for  which  all  conditions  of 
Morey's  formulas  were  met,  and  estimates  for  all  parameters  were  available.  The  para¬ 
graphs  below  explain  why  the  data  did  not  meet  the  conditions  of  Morey's  formulas,  and 
why  this  analysis  was  flawed. 

The  first  problem  was  that  insufficient  data  was  collected.  The  transaction 
counts  provided  in  the  NES  F/.M  statistics,  which  were  used  to  determine  how  long  to 
collect  data,  did  not  accurately  indicate  the  workload  being  handled  by  the  Enlisted  Re¬ 
search  Correction  Section.  This  is  because  some  transactions  arc  now  being  returned  to 
the  input  source  for  correction,  through  SDS.  Until  program  changes  are  made  on 
NOCS,  there  will  be  no  better  way  to  assess  what  errors  the  section  is  actually  correct¬ 
ing,  versus  what  erroneous  transactions  they  are  just  deleting  from  the  error  suspense 
file. 

Another  problem  occurred  because  the  value  of  r  was  based  on  almost  two 
complete  years  of  data,  while  the  sample  used  to  derive  F  was  very  small.  This  leads  to 
making  unrealistic  comparisons  between  r  and  F  Since  Morey's  article  did  not  explicitly 
state  how  he  determined  the  values  for  various  parameters  in  his  equations,  it  was  felt 
that  this  approach  was  sound.  The  only  other  alternative  would  have  been  to  track  the 
disposition  of  specific  transactions,  and  measure  every  parameter  from  the  same  sample. 
This  would  have  meant  collecting  all  of  the  data  from  the  Enlisted  Research  Correction 
Section,  rather  than  using  some  data  from  repo,  s  which  are  already  produced. 


54 


It  is  also  important  to  note  that  in  this  analysis,  where  a  parameter  could  not 
be  determined  from  available  or  collected  data,  it  was  estimated.  Since  nt  ,  which  was 
an  estimated  parameter,  was  very  large  in  comparison  to  the  values  Morey  used,  there 
is  some  question  as  to  whether  this  technique  is  appropriate  for  data  files  where  all  ele¬ 
ments  are  not  regularly  updated. 

Two  final  problems  influenced  the  results  of  this  study.  During  the  time  that 
this  data  was  collected,  there  was  some  suspicion  that  duplicate  tapes  were  entered  into 
processing.  Confusion  as  to  how  to  record  the  disposition  of  these  duplicate  trans¬ 
actions  complicated  matters.  In  addition,  the  directions  provided  to  those  gathering 
data  were  not  explicit  enough,  and  the  recording  sheet  was  unclear.  The  disposition  of 
transactions  should  have  simply  been  categorized  in  two  ways:  Correct  transaction  - 
reentered  as  is,  or  Incorrect  transaction  -  deleted  or  corrected. 


B.  RESOURCE  ALLOCATION  FOR  DATA  MAINTENANCE 

It  would  be  ideal  if,  in  addition  to  knowing  the  intrinsic  transaction  error  rates,  those 
rates  could  be  used  to  allocate  resources  for  data  maintenance.  Ballou  and  Kumar. Taxi 
developed  an  integer  program  that  does  just  that.  However,  the  model  requires  a  great 
deal  of  data  which  are  not  easily  attained  in  the  NTS  environment.  A  description  of  the 
integer  program  and  the  difficulties  experienced  in  trying  to  apply  it  are  explained  in  the 
paragraphs  which  follow. 

I.  The  Integer  Program  (IP) 

Maximize 

n  r  n 

ELI  ^k^kekfiM) 

i=l  /=!  _£=  I 

subject  to 


(3) 


X'j  =  0  or  1 ,  /  =  n.j  =  1 ....  ,  f(i) 


eu) 

Yux‘j- ii  =  1 


j=  1 


(5) 
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(6) 


According  to  Ballou  and  Kumar.Tayi: 
n  number  of  data  sets  S,  =  1.2 

P,  cost  incurred  to  the  organization  for  each  undetected  error  in  data  set  / 

N,  number  of  data  units  in  data  set  / 

e,  stored  data  error  rate  prior  to  maintenance  for  data  set  /' 

P„(k)  effectiveness  (i.e.,  ratio  of  number  of  errors  detected  to  total  number  of 
errors  )  on  data  set  A:  (A  =  1,  2, ....  n)  of  applying  maintenance  procedure  j  to  data  set 
i,j=  1,2....,/  (i).  (/  (i)  is  the  number  of  maintenance  options  available  for  data 
set  i  > 

c„  cos:  per  data  unit  of  applying  maintenance  procedure  j  to  data  set  i 
C„(k)  cost  per  data  unit  in  data  set  k  of  correcting  data  units  identified  as  defi¬ 
cient  as  a  result  of  applying  procedure  j  to  data  set  i 

F,j  fixed  cost  of  maintenance  procedure  j  on  data  set  i 

R  total  resources  (in  same  units  as  P )  available  for  data  quality  maintenance 
(Ballou,  1989,  p.322) 

For  a  more  complete  discussion  of  the  variables  and  the  model  formulation  see  the  paper 
by  Ballou  and  Kumar.Tayi  (1989). 

3.  Difficulties  in  using  the  IP 

The  IP  could  not  reasonably  be  applied  in  the  current  NFS  environment.  The 
cost  of  errors  is  difficult  if  not  impossible  to  quantify,  and  there  are  over  2700  possible 
errors.  A  sample  page  from  the  NES  error  listing  is  contained  in  Figure  8.  Even  if  these 
errors  could  be  grouped  into  appropriate  classes,  it  would  still  be  a  formidable  task  to 
quantify  each  class  for  each  transaction.  The  effectiveness  of  various  data  maintenance 
techniques  has  never  been  measured,  and  if  they  had  been  it  would  be  difficult  to  use 
these  quantities  in  an  equation  that  considers  them  indexes  of  the  same  thing.  The  rea- 
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son  for  this  is  that  the  effectiveness  of  transaction  error  correction  would  be  based  on 
data  sets  consisting  of  transactions,  just  as  was  done  when  using  Morey's  technique.  In 
this  case  some  data  elements  would  be  contained  in  more  than  one  data  set.  The  effec¬ 
tiveness  of  a  file  sweep  or  reconciliation  would  have  to  be  measured  based  on  data  sets 
which  would  be  made  up  of  data  elements.  Once  a  transaction  is  applied  to  the  file,  its 
structure  is  lost  and  cannot  readily  be  recovered.  Currently  the  data  sets  validated  in  file 
sweeps  do  not  exactly  match  any  of  the  transactions  contents.  As  discussed  in  Chapter 
III,  it  is  also  difficult  to  quantify  what  NMPC-16  is  spending  for  error  research  and 
correction,  much  less  for  data  maintenance  as  a  whole.  Therefore,  the  fixed  and  variable 
costs  of  data  maintenance  techniques  are  difficult  to  ascertain.  Even  R.  the  value  of  the 
total  resources  available  for  data  quality  maintenance,  is  not  readily  available,  as  some 
of  those  functions  are  being  performed  by  several  MPT  organizations,  including  NMPC, 
EPMAC,  and  NFC.  All  of  these  factors,  and  undoubtedly  others  which  have  yet  to  be 
identified,  combine  to  make  this  task  too  difficult  to  undertake  within  the  scope  of  this 
study. 


C.  CHAPTER  SUMMARY 

Efforts  to  apply  quantitative  techniques  to  the  complicated  NFS  environment  were 
not  successful,  but  much  was  learned  in  the  process.  These  lessons  learned  could  help 
in  the  future,  by  providing  essential  background  cn  what  data  to  gather,  and  how  to 
gather  them.  When  collecting  data,  directions  to  data  gatherers  (who  are  in  most  cases 
junior  personnel)  need  to  be  precise  and  clear.  For  example,  the  categories  for  disposi¬ 
tion  of  transactions  should  have  been  easier  to  distinguish.  Samples  should  not  be  col¬ 
lected  during  a  period  of  flux,  such  as  when  software  modifications  are  being  conducted, 
or  when  the  disposition  of  errors  is  being  changed.  Ambiguities  in  the  application  of 
new  techniques  should  be  clarified  before  data  are  collected.  For  example.  Morey  pro¬ 
posed  no  upper  bound  on  intertransaction  times,  however  he  did  state  that  the  record 
type  was  one  which  would  receive  periodic  updates.  If  certain  elements  in  the  record 
never  change,  or  if  others  change  only  after  long  intervals,  does  that  mean  this  technique 
is  inappropriate?  In  addition,  collecting  one  parameter  of  an  equation  from  historical 
data,  and  another  from  anecdotal  data  docs  not  seem  to  be  appropriate.  The  assump¬ 
tion  was  that  it  would  allow  for  a  more  accurate  measure  of  overall  transaction  error 
rates,  but  in  come  cases,  the  large  sample  made  this  parameter  sufficiently  low  as  to  in¬ 
validate  the  use  of  the  estimation  equation.  Finally,  it  is  apparent  that  if  quantitative 
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N-1652  MAPMIS  1306-7262  TAC  ERR  MSG  CODES  AS  OF  JUM  20 
TAC  ERROR  DESCRIPTION 

300  2B  ENL  CODE  =  3R;  PRES  PG  MUST  BE  E1-E3  P 

300  2C  EMR  SCH/OTH  GREATER  THAN  TRANS  TERM  P 

300  2D  EMR  BR/CL  NOT  11,15.32  P 

300  2E  TRANS  DT  OF  OCCUR  NOT  >  EMR  EAOS  P 

300  2F  EAOS  CANNOT  EXCEED  EREN 

300  2G  EMR  PEBD  INVALID 

300  2H  TRANS  DOB  <  16  YRS  PRIOR  TO  EMR  PEBD 

300  2J  EAOS  EXP-NO  EXTENS  OR  INDIC  ON  FILE 

300  2K  EAOS  EXP-EXTENS  PRES-MAKE  OPERATIVE 

300  2N  EMR  PEBD  INVALID  YR-MO-DA 

300  2P  DT  OF  OCC  N=00-90  DAYS  PRIOR  EMR  EAOS 

300  2Q  EMR  BR/CL*  68,78;  LOSS  CODE  NOT*06,46  P 

300  2R  EMR  BR/CL  =  68;  ENL  CODE  NOT*  41  OR  51  P 

300  2S  EMR  BR/CL  =  78;  ENL  CODE  NOT*  11  OR  51  P 

300  2T  DT  OCC  NOT  3M0S-1YR  PRIOR  TO  EMR  EAOS 

300  2V  EMR  BR/CL  NOT  =  32,  68  OR  78  P 

300  2W  STAR,  EMR  NO.  OF  ENLISTMENTS  MUST  BE  1 

300  2X  EMR  BR/CL*  68,78;  TAC  BR/CL  MUST  BE  11  P 

300  3K  TRANS  PERS  LOSS  INVALID 

300  3M  DT  OCC  N=00-90  DAYS  PRIOR  EAOS/EREN 

300  30  EMR  NBR  ENL  N=  1-9,  A-F 

300  3R  EMR  ADSD  INVALID 

300  3S  XXEMR  CED  >  TAC  DT  OCC 

300  3T  EMR  BR/CL  N=  TAC  BR/CL  P 

300  3U  EDLN  REAS  =  RRR,  RRA,  OR  QCP 

300  4A  DT  OF  OCC  NOT*  00-90  DAYS  PRIOR  EMR  EAOS 

301  A1  INVALID  SOURCE  CODE  P 

301  A4  INVALID  NEH  NAME  P 

301  02  INVALID  SSN  RANGE  P 

301  03  INVALID  NAME  P 

301  IB  UNMATCHED  SSN  P 

301  1C  UNMATCHED  NAME  P 

301  2A  XXNAME  UNCHANGED 

327  A1  INVALID  SOURCE  CODE  P 

327  A4  INVALID  SEX  CODE  P 

327  02  INVALID  SSN  RANGE  P 

327  03  INVALID  NAME  P 

327  IB  XXUNMATCHED  SSN  P 

327  1C  UNMATCHED  NAME  P 

327  2A  XXTAC  SEX  =  EMR  SEX 

328  A1  INVALID  SOURCE  CODE 

328  A4  INVALID  AUTHORITY  CODE 

328  A5  INVALID  PRESENT  RATE 

328  A6  NEW  RATE  =  BLANK,  AUTH  N=  A,B,C,G 

328  A7  INVALID  NEW  RATE 

328  A8  XXSRC=6;  NEH  RATE  N=  APPRENTICE 

328  A9  INVALID  EFFECTIVE  DATE 

328  B2  INVALID  TIME-IN-RATE  DATE 

328  B 6  AUTH*0, NEW  CODE  N=  3600,5000,6000,78000 

328  B7  TIR  INVALID  FOR  EFFECTIVE  DATE  OF  TRANS 

328  Z2  INVALID  DATE  OF  OCCURRENCE 

328  02  INVALID  SSN  RANGE 

328  03  INVALID  NAME 

328  IB  UNMATCHED  SSN 

328  1C  UNMATCHED  NAME 

328  2A  TAC  EFF  DT  >  TAC  AS  OF  DATE 

328  2B  XXDECL  OF  RATE-NO  EMR  PROS  RATE 


Figure  8.  MAPMIS  TAC  Error  Codes 


techniques  such  as  these  are  to  be  properly  tested,  a  significant  commitment  in  time  and 
resources  will  have  to  be  expended  in  the  data  gathering  phase  of  the  endeavor. 


V.  DATA  QUALITY  IN  THE  EMF  OF  THE  FUTURE 

In  this  chapter,  recommendations  for  providing  enhanced  data  quality  in  the  inte¬ 
grated  data  base  of  the  future  are  proposed.  These  recommendations  are  explained 
based  on  the  Data  Quality  Initiates  framework  developed  in  Chapter  II,  and  are  com¬ 
piled  and  distilled  from  many  researchers'  philosophies  and  efforts.  These  include:  the 
concept  of  data  value  as  a  decision  driver  (Varley,  1969),  the  classification  of  data  quality- 
control  techniques  based  on  the  SDLC  (Brodie,  1980),  and  the  consideration  of  the 
probability  of  being  able  to  maintain  a  particular  data  item  before  deciding  to  collect  it 
(Davis,  1985). 

A.  BACKGROUND 

The  functions  of  quality  control  and  quality  assurance  have  typically  been  con¬ 
ducted  after  a  product  was  designed  and  produced.  This  has  been  true  in  disciplines  as 
diverse  as  manufacturing  and  software  development.  Fortunately,  managers  have  re¬ 
cognized  the  importance  of  ensuring  that  a  quality  product  is  produced,  and  have  started 
to  engineer  products  with  particular  quality  standards  in  mind.  In  this  way,  either 
quality  is  assured,  or  at  least  the  product  is  more  maintainable.  This  early  emphasis  on 
quality  has  been  used  in  industry,  through  initiatives  such  as  TQM,  and  it  has  been  used 
in  software  development,  thorough  methodologies  such  as  structured  analysis  and  de¬ 
sign.  At  the  same  time  that  information  science  has  been  making  advances  in  the  area 
of  software  quality,  it  has  also  begun  to  focus  on  information  as  a  strategic  resource  in 
business.  This  focus  has  been  fueled  by  development  strategies  such  as  Information 
Engineering.  Now  it  is  time  for  information  systems  planners  to  take  data  driven  strat¬ 
egy  a  step  further,  and  address  data  quality  throughout  the  SDLC.  This  is  particularly 
true  for  large-scale  MISs.  Data  quality  planning  can  be  done  by  assessing  the  value  of 
the  data  that  will  be  collected  and  the  resources  which  must  be  expended  to  maintain  the 
data,  and  by  making  an  early  decision  as  to  whether  there  are  sufficient  resources  to  keep 
that  data  at  the  desired  level  of  quality: 

The  relationship  between  cost,  priority,  data  worth,  and  error  detection  and  cor¬ 
rection  procedures  should  form  a  set  of  principles.  These  principles  will  enable  the 
system  manager  or  system  designer  to  provide  the  system  users  with  information 
that  is  more  accurate  and  more  usable  than  ever  before.  (Varley,  1969,  p.  138) 
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Admittedly,  implementing  this  strategy  is  easier  said  than  done,  and  such  problems  as 
being  able  to  do  better  economic  evaluations  will  have  to  be  resolved.  On  the  plus  side, 
data  integrity  languages,  DD/DS,  and  Operations  Research  (OR)  are  powerful  tools, 
which  properly  applied,  could  help  improve  data  quality  maintenance. 

B.  APPLYING  THE  FRAMEWORK  TO  THE  NEW  DATA  BASE 

In  this  section,  recommendations  on  how  to  handle  planning  for  quality  control  in 
the  data  base  of  the  future  will  be  provided.  These  recommendations  are  separated  into 
the  categories  presented  in  the  Data  Quality  Initiatives  framework,  based  on  when  in  the 
SDLC  they  will  be  used.  However,  since  these  initiatives  are  being  considered  before 
systems  implementation,  to  a  certain  extent,  they  are  all  efforts  to  engineer  data  quality, 
in  advance  of  the  system's  deployment. 

1.  Engineering  in  Data  Quality 

Plans  indicate  that  NMPC-16  will  load  the  IRE  with  the  IMPDB  logical  data 
model.  Though  the  data  elements  have  been  standardised,  more  user  input  should  be 
collected  and  added  to  the  meta  data.  A  group  of  users  of  MPT  data  should  be  formed 
to  classify  all  off  the  500  plus  data  elements,  in  terms  of  their  value  delivered  to  the  MPT 
organization  and  the  Navy  .  After  looking  at  just  NES  data,  and  trying  to  assess  its  value 
for  use  in  the  IP  mentioned  in  Chapter  IV,  it  is  apparent  that  this  is  a  large  task.  To 
quantify  the  value  of  each  data  element  monetarily  would  be  too  difficult.  However,  it 
is  feasible  for  the  data  to  be  ranked,  if  a  system  similar  to  that  used  for  security  risk  as¬ 
sessments  is  employed.  In  this  system,  level  one  data  could  be  those  data  of  strategic 
importance  to  the  Navy.  Level  two  data  would  be  data  of  significant  strategic  impor¬ 
tance.  and  level  three  data  would  then  be  of  moderate  or  minimal  importance.  Once 
these  classifications  are  established,  associated  accuracy  and  timeliness  tolerances  could 
be  assigned.  These  data  could  then  be  entered  into  the  IRE.  and  would  have  the  im¬ 
mediate  benefit  of  providing  explicit  priorities  for  data  quality  control.  It  would  also 
have  a  long  term  benefit.  Data  could  be  positioned  and  accessible  if  optimization  rou¬ 
tines  using  these  parameters  are  developed  and  implemented.  Such  prestaging  is  con¬ 
sidered  a  strategic  move  in  planning  for  the  information  technology  of  tomorrow.  I  he 
new  IRE  entries  would  be  similar  to  those  that  appear  in  Figure  9. 

2.  Software  and  Hardware  Quality  Controls 

In  moving  to  a  data  base  environment,  NMPC-16  will  take  advantage  one  of 
the  technological  improvements  that  became  widely  available  in  the  19S0s.  However, 
managing  the  quality  of  the  data  base  will  be  a  new  challenge.  Along  these  lines, 
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ITEM 


Pay-Grade 

DESCRIPTION 

The  pay-grade  in  which  the  member  is  currently  serving. 
PERMISSIBLE  VALLES 


El 

SR 

W1 

WO 

Ol 

ENS 

E2 

SA 

W2 

WO 

02 

LTJG 

E3 

SN 

W3 

wo 

03 

LT 

E4 

P03 

YV4 

wo 

04 

LCDR 

E5 

P02 

05 

CDR 

E6 

POl 

06 

CAPT 

E7 

CPO 

07 

RADM 

E8 

SCPO 

08 

RADM 

E9 

MCPO 

09 

VADM 

VALUE  CATEGORY 
Level  I 

ACCURACY  TOLERANCE 
.01% 

TIMELINESS  TOLERANCE 
30  Days 

EDIT/VALIDATION  CRITERIA 

Changes  of  only  one  pay-grade  are  permitted,  unless  in  association 
with  a  disciplinary  action  or  a  data  base  correction. 

Others  which  might  apply  .  .  .  (could  be  stated  using  a  data 
integrity  language). 

Figure  9.  Updated  IRE  Entry  for  the  Data  Element  Pay-Grade 

NMPC-I6  should  investigate  the  use  of  a  data  integrity  language.  These  languages  are 
experimental;  in  fact,  in  1986,  Date  considered  them  to  be  hypothetical  (1986,  p.446). 
Later,  the  DEC  system  called  Rdbl  VMS  became  available,  and  incorporated  some  of  the 
controls  that  Date  described  in  his  hypothetical  language  (1986,  p.437).  Using  a  scheme 
such  as  Date  purports:  "Note  therefore,  that,  a  transaction  can  be  regarded,  not  only 
as  a  unit  of  work  and  a  unit  of  recovery  and  a  unit  of  concurrency,  but  also  as  a  unit 
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of  integrity.''  (1986,  p.448)  (Date's  emphasis)  This  integrity  language  could  provide 
needed  data  quality  control  at  the  headquarters  and  field-input  levels. 

3.  Methods  Involving  Data  Capture 

Now  that  NMPC-16  has  moved  to  on-line  input  of  data  at  many  of  the  field 
activities,  what  further  enhancements  in  data  capture  can  they  incorporate  into  their 
planning?  First,  all  sites  need  to  be  brought  up  with  on-line  editing  of  interactive  input. 
SDS  must  be  fielded  overseas,  something  which  has  been  delayed  at  least  a  decade  since 
the  original  implementation  plan.  SDS  Afloat,  to  be  used  on  ships,  is  still  under  devel¬ 
opment.  The  afloat  system,  like  the  shore  system,  will  place  the  data  entry,  data  vali¬ 
dation,  and  even  the  error  detection  and  correction  at  the  input  source.  This  allows  for 
the  most  effective  data  quality  control  and  the  most  efficient  error  correction.  As  SDS 
Afloat  is  fielded,  redundant  systems  like  DMRS  should  be  eliminated.  This  standardi¬ 
zation  will  make  it  easier  for  the  sailors  and  civilians  who  report  pay/pcrsonnel  data  to 
learn  proper  administrative  techniques,  because  there  will  be  only  one  system  with  which 
to  cope.  At  present,  SDS  and  DMRS  are  not  covered  in  detail  at  PN  "A"  School,  due 
to  time-constraints  and  to  the  systems'  redundancy.  When  one  standard  system  is  in 
place,  it  will  be  easier  to  train  sailors  and  to  set  policy  for  common  data  quality  controls. 
However,  it  will  be  a  challenge  to  reduce  the  multiple  reporting  systems  to  a  common 
system  and  to  keep  multiple  copies  of  the  input  edits  in  synchronization  with  one  an¬ 
other. 

A  second  issue  regarding  improved  data  capture  is  one  that  VMPC  apparently 
has  not  considered.  For  ease  of  data  entry  in  the  field,  voice  technology  offers  a  quicker 
and  more  accurate  input  method  for  redundant  tasks,  which  is  characteristic  of  some 
field  inputs.  It  might  be  beneficial  for  the  Field  Personnel  Systems  Division, 
NMPC-167,  to  investigate  use  of  voice  technology  in  future  SDS  upgrades. 

4.  Methods  Involving  Error  Detection 

Error  detection  improvements  depend  on  many  of  the  things  previously  dis¬ 
cussed  in  this  chapter.  These  are  things  such  as  implementation  of  a  data  integrity  lan¬ 
guage,  cleaning  up  and  synchronizing  system's  edits,  and  better  training  personnel.  In 
addition,  in  the  new  data  base,  a  library  of  queries  to  detect  data  base  inconsistencies 
should  be  developed.  These  should  be  run  on  a  recurring  basis  to  identify  data  that 
though  accurate  when  originally  input,  is  volatile  and  no  longer  valid. 

5.  Methods  Involving  Error  Correction 

It  is  likely  that  most  error  correction  techniques  will  always  require  human 
intervention.  By  their  very  nature,  errors  are  exceptions  in  processing,  and  human  be- 
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ings  handle  exceptions  better  than  computer  systems.  However,  a  policy  regarding  error 
correction  responsibility  needs  to  be  developed.  Gradually,  more  errors  are  being  sent 
back  to  the  input  source  for  correction,  but  criteria  for  identifying  when  this  is  appro¬ 
priate,  and  when  it  is  not,  have  not  yet  been  established  by  N MFC- 16. 

It  would  also  be  valuable  if  statistics  regarding  the  disposition  of  errors  could 
be  collected  in  an  automated  fashion.  This  would  provide  better  data  for  a  quality  esti¬ 
mation  technique  such  as  Morey's.  This  data  may  have  to  be  collected  in  at  least  two 
systems,  N'OCS  for  in-house  corrections,  and  SDS  for  those  errors  turned-around  in  the 
field. 

6.  Methods  for  Assessing  Data  Quality 

N'MPC-16  needs  to  dedicate  some  resources  to  experimenting  with  ways  to  as¬ 
sess  data  quality  in  NES  and  in  its  other  corporate  data  bases,  which  will  eventually  be 
integrated.  Morey  and  others  have  proved  that  estimation  techniques  for  measuring 
data  quality  can  be  developed.  While  this  study's  attempt  to  apply  that  technique  was 
not  particularly  successful,  further  investigation  into  quantifying  data  quality  in  the  NES 
and  other  corporate  systems  is  warranted.  Assessment  is  a  component  vital  to  effective 
management  of  many  aspects  of  information  technology. 

7.  Resource  Allocation  Techniques 

Chapter  IV  briefly  mentioned  an  IP  to  allocate  resources  to  data  quality  main¬ 
tenance  techniques.  OR  is  being  used  in  many  disciplines,  to  consider  decision  parame¬ 
ters  too  complex  for  the  human  decision  maker  to  manage.  Faculty  and  students  of  the 
OR  department  at  the  Naval  Postgraduate  School  are  interested  in  Navy  applications 
for  modeling.  NMPC-16  should  team  up  with  OR  professors  and  thesis  students  to  see 
if  a  resource  allocation  model  more  appropriate  to  NES  or  its  future  environment  can 
be  developed.  The  model  proposed  by  Ballou  and  Kumar.Tayi  appeared  appropriate  for 
application  to  the  NES  data  maintenance  environment,  but  required  too  many  inputs 
to  make  its  use  feasible.  An  OR  student  many  be  able  to  take  some  of  the  parameters 
recommended  for  inclusion  in  the  meta  data  of  the  new  integrated  data  base  and  for¬ 
mulate  a  model. 

8.  Fitting  the  Recommendations  Together 

How  might  NMPC-16  use  these  recommendations  together  to  develop  a  pro¬ 
gram  for  data  quality  in  the  future?  Figure  10  provides  an  illustration  of  how  these  in¬ 
itiatives  can  fit  together,  creating  an  environment  where  data  quality  is  actively 
managed. 
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ESTABLISH  DATA  NEEDS 

Using  a  Data  Driven/ Information  Engineering  Methodology 

l 

ASSESS  DATA  VALU  E 
Using  a  Quality  Data  Strategy 

i 

Assess  resources  needed  to  maintain  data 

I 

Determine  if  resources  required  for  maintenance  exceed  data  value 

1 

DEFINE  META  DATA  REQUIREMENTS 
For  documentation,  for  data  validation,  for  data  quality  maintenance 
and  assessment 
1 

Build  DD/DS  (IRE) 

1 

Determine  data  values,  costs  of  errors,  accuracy  and  timeliness 
tolerances,  and  data  validation  rules 

i 

ENFORCE  VALIDATION  REQUIREMENTS 
Using  a  Data  Integrity  Language 

i 

DESIGN  STANDARDIZED  ON-LINE  SYSTEMS  FOR  DATA  CAPTURE 
Down-load  validation  schemes 
1 

DEVELOP  LIBRARY  OF  ERROR  DETECTION  ROUTINES 
Determine  intervals  to  run  these  to  eliminate  volatile  data 

I 

ESTABLISH  TECHNIQUES  FOR  DATA  QUALITY  ASSESSMENT 
Design  automated  wavs  to  collect  inputs  for  these  assessment  routines 

i 

ESTABLISH  TECHNIQUES  TO  ALLOCATE  DATA  MAINTENANCE 
RESOURCES 
i 

INITIAL  DATA  QUALITY  DESIGN  TASKS  COMPLETE 
Figure  10.  Managing  the  Quality  of  the  Personnel  Data  Base 

C.  CHAPTER  SUMMARY 

This  chapter  has  proposed  that  NMPC-16  dedicate  resources  to  exploring  several 
new  technologies  which  could  contribute  to  enhanced  data  quality  in  the  future.  It  is 
appropriate  that  these  be  considered  now,  as  the  GFR  for  the  integrated  personnel  data 
base  is  being  refined  and  SDS  Afloat  is  being  developed.  The  technologies  which 
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NMPC-16  should  investigate  include:  defining  data  values  with  data  integrity  languages, 
capturing  data  with  voice  technology,  and  allocating  data  maintenance  resources  with 
OR.  Bringing  these  technologies  into  the  mainstream  will  require  that  the  MPT  com¬ 
munity  assess  the  benefits  to  be  gained  from  accurate  information,  and  that  the  MPT 
IRM  managers  convince  resource  sponsors  that  capturing  these  benefits  is  a  high  pri¬ 
ority. 

Some  of  the  initiatives  proposed  here  may  be  too  ambitious,  too  costly,  or  for  one 
reason  or  another  may  not  fit  within  the  broader  strategic  goals  of  the  of  the  MPT  or 
MPT  IRM  communities.  However,  this  analysis  sought  to  take  a  new  look  at  an  old 
problem,  and  perhaps  germinate  an  idea  which  will  mature  into  a  valuable  program.  In 
this  ambitious  no-holds-barred  approach  to  solving  the  data  quality  problem,  an  attempt 
was  made  to  fit  recommendations  into  programs  already  articulated,  however  resource 
constraints  were  not  considered.  Admittedly,  MPT  IRM  managers  do  not  have  this 
freedom,  so  some  of  the  ideas  presented  here  may  seem  a  little  far-fetehed.  however:  "At 
resource  allocation  time,  the  difference  between  an  effective  strategic  initiative  and  a 
harebrained  scheme  is  razor  thin.  Only  after  the  passage  of  money  and  time  is  the  an¬ 
swer  obvious."  (Cash,  19SS,  p.  145) 
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VI.  CONCLUSIONS 


Solving  the  data  quality  problem  within  the  NFS  is  not  easy  for  three  reasons.  First, 
NMPC  has  already  effected  a  broad  range  of  measures  which  have  positively  influenced 
the  EMF's  data  quality.  Therefore,  common  concerns  like  improving  data  capture 
through  on-line  systems  have  already  been  identified,  and  improvements  have  been  ini¬ 
tiated.  Second,  a  study  of  current  research  and  new  technology  did  not  reveal  any  ge¬ 
neric  data  quality  improvement  schemes,  which  could  be  easily  applied  to  NFS. 
Information  professionals  have  not  devoted  much  attention  to  the  data  quality  issue, 
until  recent  years.  In  the  past,  maintenance  of  data  quality  has  been  restricted  to  cor¬ 
recting  rejected  inputs.  To  complicate  matters,  data  maintenance  problems  are  unique 
to  each  environment.  Also,  assessment  of  data  quality  has  typically  been  left  to  audi¬ 
tors.  However,  NMPC-16's  future  Chief  Information  Officer  of  cannot  affort  to  dismiss 
an  issue  which  is  one  of  the  users-  chief  concerns.  Third.  NFS  must  function  in  a  com¬ 
plex  environment  The  data  are  input  at  diverse  locations,  the  systems'  interfaces  are 
extensive,  the  users'  functional  requirements  are  elaborate,  and  proper  management 
control  is  difficult  to  achieve.  Davis  says  that  "The  ability  of  an  organization  to  main¬ 
tain  data  quality  depends  on  both  organizational  factors  and  data  factors."  ( 19S5,  p.61 1) 
These  are: 

1.  Length  of  error  effect  cycle 

2.  Regularity  of  measurement 

3.  User-provider  link 

4.  Provider  data  discipline 

5.  F.ase  of  verification 

(Davis,  1985,  p.6U) 

In  the  paragraphs  that  follow  a  brief  analysis  of  each  of  Davis's  criteria  will  point  out 
why  there  are  no  magic  solutions  to  the  data  quality  problem. 

The  length  of  the  error  effect  cycle  in  NFS  is  varied.  For  pay-related  data  items  the 
error  affects  the  sailor  almost  immediately,  or  at  least  within  15  days.  Therefore,  pay- 
related  errors  are  much  more  likely  to  be  corrected.  However,  other  errors  may  remain 
buried  in  the  file,  impacting  only  statistical  analyses  and  MPT  policy  formulation.  These 
types  of  errors  are  more  insidious  and  have  a  much  greater  potential  for  affecting  the 
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health  of  the  file  over  the  long  term.  While  it  is  true  that  pay-related  errors  can  cost  the 
government  money  if  they  are  not  identified,  in  most  cases,  improper  payments  are  later 
recouped.  It  is  more  difficult  to  assess  the  impact  of  errors  on  concerns  such  as  strength 
and  accession  planning. 

Another  factor  which  affects  the  MPT  IR.M  community's  ability  to  control  the 
quality  of  NES  data,  is  the  fact  that  its  accuracy  is  not  regularly  measured.  The  com¬ 
pleteness  of  the  data  is  reported,  and  timeliness  will  be  measured  with  more  accuracy  in 
the  future,  however  a  report  on  overall  data  quality  is  not  produced  on  a  recurring  basis. 
Assessments  are  made  only  when  certain  high  visibility  issues  surface,  for  example,  most 
recently  the  quality  of  eligibility  data  for  the  G.I.  Bill. 

The  user-provider  link  in  NES  is  not  very  strong.  The  primary  users  of  the  data  are 
headquarters  organizations.  The  data  providers  do  not  necessarily  feel  a  responsibility 
to  these  organizations,  and  sometimes  because  of  the  extensive  requirements  levied  by 
headquarters  on  the  already  over-tasked  field  activities,  an  adversarial  relationship  exists. 
This  relationship  is  improving,  because  SDS  provides  a  tighter  link  between  headquar¬ 
ters  and  the  field.  Now  headquarters'  data  can  be  reconciled  with  field  data  in  an  auto¬ 
mated  manner,  and  used  to  produce  reports  for  local  management  control.  Examples 
of  some  of  these  reports  include  projected  rotation  date  reports  and  end  of  active  obli¬ 
gated  service  reports.  The  user-provider  link  has  been  strengthened  for  those  activities 
that  have  SDS. 

The  provider  data  discipline  is  a  concern,  because  so  many  different  sites  input  data 
into  NES.  It  is  difficult  to  oversee  discipline  in  all  the  activities  which  must  report  per¬ 
sonnel  and  pay  data.  For  the  most  part,  the  shore  establishment  is  tightly  managed, 
Personnel  Support  Detachments  (PSD)  exist  expressly  for  the  purpose  of  providing 
personnel/pay  support.  However,  even  in  PSDs  there  are  competing  concerns,  such  as 
pay  days,  temporary  duty  processing,  transfers,  and  advancement  examination  cycles. 
It  seems  logical  to  speculate  that  these  pressures  only  multiply  in  a  operational  envi¬ 
ronment,  where  accurate  reporting  of  data  to  headquarters  is  a  minor  concern.  Training 
each  of  the  individuals  who  will  have  the  potential  to  impact  the  quality  of  NES  data  is 
a  formidable  task  at  best. 

Finally,  and  perhaps  the  most  significant  problem  in  achieving  better  data  quality 
in  the  EMF  is  the  difficulty  in  verifying  it.  The  only  way  to  accurately  validate  the  data 
is  to  go  to  the  individual  source  documents  and  check  them.  This  can  be  done  by  using 
the  paper  service  record  in  the  field  or  the  micro-fichc  record  at  headquarters.  Both  of 
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these  validation  procedures  require  manual  intervention,  and  are  time-consuming  at 
best. 

Evaluating  NES  with  respect  to  Davis's  factors  points  out  just  how  difficult  it  is  to 
identify  a  way  to  improve  data  quality,  which  will  make  a  significant  difference  and  will 
be  cost-effective.  However,  simply  correcting  errors  docs  not  represent  adequate  quality 
control  at  the  headquarters  level.  While  N  VI  PC- 16  has  introduced  initiatives  which  co¬ 
incidentally  improved  data  quality  in  the  NES  and  other  systems  they  manage,  these 
initiatives  were  not  undertaken  as  part  of  a  data  quality  control  plan. 

A.  IMPROVING  THE  DATA  MANAGEMENT  ORGANIZATION 

The  reorganization  of  19S8  has  not  achieved  the  changes  needed  to  improve  data 
quality.  This  reorganization  sought  to  make  XMPC-16  more  data-oriented,  and  even 
established  a  separate  Data  Management  Division.  Though  the  reorganization  was  of¬ 
ficial  in  April  of  1988,  many  of  the  division's  management  positions  were  filled  with 
acting  directors  or  left  vacant.  While  the  reasons  for  this  were  beyond  the  immediate 
control  of  local  managers,  it  has  hurt  the  development  of  the  Data  Management  Divi¬ 
sion  none  the  less. 

The  Head  of  the  Data  Management  Division  is  a  Navy  Captain  who  is  an  Acting 
Director.  He  is  double-hatted,  responsible  for  both  the  policy  and  implementation  sides 
of  data  quality.  This  Acting  director  should  be  left  in  charge  of  policy  setting;  it  is  val¬ 
uable  to  have  blue-jacket  influence  in  that  area.  However,  a  computer  specialist  should 
be  hired  to  run  the  XMPC/Implcmentation  side.  The  person  hired  to  fill  this  position 
should  have  a  background  similar  to  that  usually  associated  with  an  Electronic  Data 
Processing  (EDP)  Auditor.  This  person  could  then  be  tasked  with  working  the  more 
technical  data  quality  issues.  There  are  several  reasons  why  having  someone  with  an 
EDP  auditing  background  would  enhance  the  effectiveness  of  this  division: 

•  By  trade  an  auditor  is  oriented  toward  assessment,  they  are  trained  to  place  value 
on  the  role  that  assessment  plays  in  proper  management.  EDP  auditing  is  a  spe¬ 
cialized  field  which  attempts  to  measure  the  effectiveness  and  accuracy  of  various 
aspects  of  computer  systems. 

•  In  addition,  an  auditor's  second  responsibility  is  to  make  recommendations  for 
improvement  to  systems  they  review.  This  fits  in  with  the  primary  mission  of  the 
data  management  division,  which  is  to  identify  ways  to  maintain  and  improve  the 
quality  of  the  corporate  data  bases. 

•  Much  of  the  current  research  concerning  data  quality  has  been  conducted  by  au¬ 
ditors  and  has  been  published  in  journals  for  the  auditing  profession.  In  many 
cases,  it  appears  that  members  of  this  profession  are  more  familiar  with  the  con¬ 
cerns  of  measuring  and  maintaining  data  quality  than  ADP  professionals. 


Since  there  is  no  specific  job  series  for  EDP  Auditing  in  the  civil  service,  the  position 
description  would  have  to  be  written  from  the  Position  Classification  Standards  for  both 
the  Computer  Specialist  Series  GS-334,  and  the  Auditing  Series  GS-511.  It  would  be 
best  to  classify  the  position  as  a  GS-334  overall,  because  it  would  better  fit  the  structure 
of  the  organization.  In  addition,  positions  which  encompass  primarily  auditing  tasks  are 
controlled  by  the  Office  of  Personnel  Management,  and  cannot  be  filled  locally.  The 
position  should  be  established  at  the  appropriate  level,  and  titled  Computer  Specialist, 
rather  than  programmer,  analyst,  or  technician.  The  position  will  carry  with  it  signif¬ 
icant  responsibility.  The  priorities  of  the  division  will  be  based  in  large  part  on  what  the 
individual  who  fills  this  position  views  as  concerns,  both  with  managing  data  quality  in 
the  current  systems,  as  with  planning  for  enhanced  data  quality  in  future  systems.  This 
employee  will  have  to  possess  "Broad  knowledge  of  data  processing  methods,  equipment 
types,  systems,  applications,  and  management  principles.  .  .  ."  (Position  Classification 
Standard  for  Computer  Specialist  Series,  1980,  p.  1 20)  The  individual  must  have  a  com¬ 
prehensive  knowledge  of  technologies  which  might  impact  on  data  quality.  When 
wearing  the  auditing  hat  this  manager  "Develops,  coordinates,  and  issues  technical  audit 
guidelines  and  instructions  for  the  inspection  of  operation  and  support  programs  and 
systems  usually  at  the  installation  level."  (Position  Classification  Standard  for  Auditing 
Series,  GS-5U,  1982,  p.74)  "The  auditor  must  justify  critical  findings  and  sell  recomm¬ 
endations  improving  the  efficiency  and  effectiveness  of  agency  programs."  (19S2,  p.SS) 
The  Data  Quality  Program  Section  is  another  area  of  concern.  A  manager  for  this 
section  was  not  installed  until  November  of  I9S9,  over  a  year  after  the  reorganization 
took  effect.  Up  until  that  time  there  was  no  comprehensive  data  quality  improvement 
plan,  in  fact,  there  were  basically  no  plans  at  all.  Now  that  there  is  some  stewardship 
in  this  section,  perhaps  a  comprehensive  set  of  internal  controls  can  be  established  for 
maintaining  an  essential  corporate  resource,  personnel  data.  To  develop  these  internal 
controls,  the  Data  Quality  Program  Section  should  conduct  a  review  of  the  data  main¬ 
tenance  activities  performed  throughout  the  organization.  Particular  emphasis  should 
be  placed  on  those  still  being  done  in  the  Corporate  Data  Systems  Division.  Some  of 
these  activities  might  best  be  handled  elsewhere.  The  reports  and  queries  run  by  con¬ 
tract  personnel  in  that  division,  should  probably  be  handled  by  NMPC-1642C  or 
NMPC-1641E.  In  any  event,  the  bottom  line  is  that  the  only  data  analysis  work  cur¬ 
rently  being  done  for  the  EMF  appears  to  be  inadequate,  and  is  being  done  in  the  wrong 
division.  Steps  are  being  taken  to  resolve  this  issue,  but  it  will  definitely  benefit  the  new 
manager  of  the  Data  Quality  Program  section  to  review  the  findings  of  the  audit  reports 
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discussed  earlier  and  the  recommendations  of  this  thesis  when  restructuring  the  priorities 
of  the  division. 

Another  concern  with  the  new  organization  relates  to  the  responsibilities  of  the 
Customer  Support  Division  (N'MPC-163).  It  would  be  valuable  if.  until  another  as¬ 
sessment  technique  is  developed,  this  division  could  survey  major  users  with  regards  to 
data  quality.  This  could  provide  the  Data  Management  Division  with  feedback  to  help 
set  priorities  among  quality  improvement  efforts.  Transferring  error  correction  assist¬ 
ance  for  field  activities  and  individuals  to  the  Customer  Support  Division  should  also 
be  revisited.  This  was  previously  suggested  in  several  management  studies.  Perhaps  a 
field  liaison  office  could  be  established  within  the  Customer  Support  Division.  An  al¬ 
ternative  to  this  would  be  to  move  these  functions  to  the  Field  Personnel  Systems  Divi¬ 
sion  for  SDS  customers  and  to  EPMAC  for  DMRS  customers.  This  transferring  of  error 
correction  assistance  for  field  activities  and  individuals  to  other  divisions  will  be  benefi¬ 
cial  only  if  it  doesn't  require  a  duplication  of  the  expertise  already  available  in  the  Data 
Management  Division. 

B.  ALLOCATING  DATA  MAINTENANCE  RESOURCES 

Some  of  the  methods  currently  used  for  data  maintenance  arc  too  costly.  NM  PC- 16 
prints  Officer  Data  Cards  (ODC)  once  a  year  and  on  demand,  for  an  officer  to  verify  the 
personnel  data  contained  in  the  Officer  Master  File  (OMF).  In  the  past,  production  of 
a  similar  Enlisted  Data  Card  has  been  considered  and  rejected.  Producing  ODCs  once 
a  year  and  on  demand  is  inefficient.  Reducing  ODC  printings  might  free  enough  re¬ 
sources  to  implement  a  verification  scheme  for  enlisted  data.  Using  this  new  scheme,  a 
verification  record  could  be  sent  to  activities  via  SDS.  For  El  -  E6,  verifications  could 
be  conducted  when  the  member  reenlists.  For  E7  -  E9  and  officers,  verifications  could 
be  completed  six  months  prior  to  any  selection  board  action  on  behalf  of  the  member. 
This  would  save  the  funds  used  for  ODC  mailings  and  forms  production,  and  could  re¬ 
duce  the  ODC  correction  workload.  The  correction  workload  would  be  reduced,  because 
ODC  verification  could  be  coordinated  by  PSDs  and  activities  providing  personnel  sup¬ 
port,  so  officers  would  not  be  as  likely  to  send  in  corrections  to  accurate  data  that  they 
just  don't  understand.  These  resources  could  be  reprogrammed  into  developing  logic  for 
producing  the  validation  runs.  This  would  have  the  added  benefit  of  providing  a  com¬ 
mon  means  for  verifying  OMF  and  EMF  data  in  advance  of  their  integration.  The  Of¬ 
ficer  Distribution  Control  Report  (ODCR)  and  the  EDVR  in  their  current  form  will  also 
be  unnecessary  in  the  future.  When  SDS  is  fully  deployed,  the  purposes  of  the 
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ODCR/EDVR  should  be  reexamined  and  unnecessary  commitments  of  resources  should 
be  eliminated.  Any  changes  which  are  considered  should  be  evaluated  with  the  impact 
of  the  integrated  data  base  in  mind. 

VI  ore  effort  should  be  put  into  using  programs  like  what  one  N.VfPC  staff  member 
called  "bubble  up".  This  program  compares  names  and  social  security  numbers  of  offi¬ 
cer  and  enlisted  personnel  from  the  tape  that  produces  the  navy  locator,  and  prints  any 
pairs  which  are  exact  or  close  matches.  These  pairs  of  records  can  then  be  researched 
and  redundancies  between  the  files  eliminated.  Similar  programs  could  be  developed  to 
check  information  in  the  file  against  current  tables  or  to  identify  data  which  is  no  longer 
valid.  Routine  queries  such  as  these  should  be  used  more  often,  because  they  contribute 
to  improved  accuracy  and  arc  efficient. 

C.  BETTER  TOOLS  FOR  MANAGING  DATA  QUALITY 

NMPC  also  needs  to  develop  better  tools  for  managing  data  quality.  For  example, 
NMPC-16  has  discussed  keeping  on-line  NES  statistics  in  order  to  better  analyze  the 
reports  produced  in  the  daily  updates.  This  is  a  good  idea,  which  should  be  implemented 
immediately,  and  added  as  a  requirement  to  the  GFR  for  IV1PDB.  A  thesis  student 
could  probably  design  the  data  base  and  load  it  with  as  many  old  statistical  reports  as 
are  available.  In  addition,  another  thesis  student  could  interview  experienced  NES  pro¬ 
grammers  and  users  and  write  some  rules  for  an  Expert  System  to  analyze  the  report 
data. 

Standard  Operating  Procedures  (SOP)  should  be  developed  for  error  researchers. 
These  could  go  a  long  way  to  help  train  new  workers  or  to  serve  as  a  desk  guide  for  ex¬ 
perienced  researchers.  This  was  recommended  by  auditors  and  is  reiterated  here.  Much 
can  be  learned  by  documenting  procedures  and  solidifying  policy.  Research  priorities 
established  in  the  IRE  could  be  reinforced  in  this  SOP.  In  addition,  responsibilities  for 
error  correction  of  particular  transactions  needs  to  be  reevaluated.  Researchers  in  the 
Enlisted  Research  Correction  Section  are  deleting  many  of  the  transactions  which  ap¬ 
pear  in  NOCS.  This  is  because  they  are  being  sent  back  to  field  activities  for  correction 
via  SDS.  This  needs  to  be  resolved  in  software,  so  controls  over  deletions  in  NOCS  can 
be  established  and  enforced.  Benoit  suggests  that  "Adding  or  deleting  ertire  records 
from  the  error  {rotating]  file  is  [should  be]  avoided  .  .  ."  (Benoit,  1979,  p.27) 

Another  audit  recommendation  was  that  an  on-line  transaction  summary  be  kept  for 
NES  just  as  it  is  for  the  Officer  Personnel  Information  System.  This  on-line  information 


enhanced  the  quality  of  error  research  for  the  officer  system,  and  could  have  the  same 
effect  on  the  enlisted  system. 

The  IRf  needs  to  become  an  active  system.  Simply  storing  meta  data  only  increases 
awareness  and  articulates  standards.  It  does  not  enforce  MAPTlS-widc  standardization. 
If  validation  routines  could  be  generated  for  all  systems  from  the  IRK  or  a  DD/DS 
system,  this  would  greatly  improve  data  quality. 

finally,  a  way  to  assess  data  quality  and  to  allocate  resources  for  data  maintenance 
needs  to  be  developed  and  institutionalized.  Once  a  methodology  is  in  place,  it  can  al¬ 
ways  be  expanded  or  refined.  The  methodology  can  serve  as  a  baseline  to  help  managers 
in  decision  making  and  to  rationalize  the  allocation  of  resources  to  data  maintenance, 
for  example,  using  Morey's  method,  if  it  is  determined  that  er  is  too  large  then: 

Improving  the  quality  of  the  incoming  transactions,  i.c.,  reducing  eT  presumably 
could  be  accomplished  by  more  training  of  the  preparers  of  the  transactions,  use  of 
optical  character  recognition  (OCR)  equipment,  more  emphasis  on  the  care  exer¬ 
cised  in  preparing  transactions,  etc.  (Morey.  1982.  p. 34 1 ) 

This  quote  is  not  presented  to  advocate  the  use  of  OCR  or  any  other  improvement 
technique,  it  just  serves  to  point  out  that  by  evaluating  parameters  of  data  quality,  re¬ 
sources  can  be  targeted  at  problem  areas.  Several  more  examples  of  this  appear  below: 

•  "Reducing  the  cycle  time  for  processing  of  transactions,  i.c..  reducing  C,.  tiiis 
could  be  accomplished  by  batching  more  frequently  or  use  of  an  on-line  opera¬ 
tion."  (Morey,  1982.  p.3-4 1 ) 

•  "Reduction  of  the  time  required  for  manual  review,  researching,  and  correction 
of  rejected  transactions,  i.c. ,  reduction  of  C2  and  C.f  This  depends  upon  more 
and/or  better  trained  clerks,  as  well  as  improving  their  access  to  the  historical 
records  or  individuals  involved.  This  is  exactly  the  thrust  of  the  Navy's  new 
PASS  system  mentioned  earlier  where  there  is  to  be  one  single  location  for  each 
Navy  Person,  handling  all  payroll,  re-enlistment,  separation,  vacation,  etc.  is¬ 
sues.  This  improved  interface  will  facilitate  the  researching  and  correction  of  re¬ 
jected  transactions."  (Morey,  1982,  p.341> 

•  "A  tightening  of  the  edits  to  reduce  the  frequency  of  Type  I  and  Type  II  errors, 
i.c.,  to  increase  P  and  reduce  P" .  This  requires  more  precision  in  the  screens  used 
and  requires  a  careful  analysis  of  the  relative  advantages  and  disadvantages  of 
deleting  or  adding  edits."  (Morey,  1982,  p. 34 1 ) 

However,  in  moving  to  this  form  of  management  by  assessment,  cost  must  be  a  major 
consideration.  The  following  observation  was  made  in  regards  to  assessing  software 
quality:  "The  major  deterrent  to  incorporating  a  measurement  program  is  cost.  If  the 
cost  outweighs  the  benefits,  the  measurement  process  is  not  worth  pursuing."  (Valett. 


1989.  p.  1 37)  NMPC-16  will  have  to  weigh  the  costs  and  benefits  before  deciding  to  de¬ 
velop  techniques  for  data  quality  assessment. 

D.  SUMMARY 

Data  quality  is  difficult  to  define  and  difficult  to  quantify,  and  improved  data  quality 
is  difficult  to  obtain.  Many  initiatives  can  contribute  to  improved  data  quality,  but  these 
should  be  coordinated  in  an  overall  program.  Developing  a  comprehensive  data  quality 
program  is  not  a  simple  task.  In  order  to  improve  their  control  of  data  maintenance, 
W1PC-16  needs  to  survey  current  initiatives  which  contribute  to  data  quality,  and  de¬ 
cide  what  future  environment  is  desired.  Once  that  is  decided,  these  goals  should  be  ar¬ 
ticulated  in  a  data  quality  plan  which  could  include  many  of  the  initiatives  suggested  in 
Chapters  V  and  VI  of  this  study.  Overall,  NMPC-16  has  been  very  proactive  in  striving 
to  provide  accurate  and  timely  data  to  the  MPT  community.  The  initiatives  suggested 
in  this  thesis  are  merely  refinements  to  a  program  which  is  already  on  the  right  track. 
It  would  have  been  easier  to  make  suggestions  for  improving  data  quality,  if  initiatives 
such  as  SDS  and  the  IRT  were  not  already  started.  However,  the  survey  of  Data  Quality 
Initiatives  conducted  in  this  study  pointed  out  some  weak  spots,  and  can  provide  a  val¬ 
uable  assessment  tool  for  other  organizations  evaluating  their  data  maintenance  envi¬ 
ronment.  Since  the  data-oriented  reorganization  is  less  than  two  years  old.  the  SDS 
Afloat  and  the  IMPDB  are  under  development,  and  some  parts  of  the  NTS  are  sched¬ 
uled  for  recoding,  now  is  an  ideal  time  for  NMPC-16  to  take  the  lead  in  addressing  data 
quality  control  as  an  IRM  concern. 
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APPENDIX  A.  LIST  OF  ACRONYMS 


ADP  -  Automated  Data  Processing 
CDC  -  Consolidated  Data  Center 

CIRMP  -  Component  Information  Resources  Management  Plan 

CNO  -  Chief  of  Naval  Operation 

CNP  -  Chief  of  Naval  Personnel 

DBMS  -  Data  Base  Management  System 

DD/DS  -  Data  Dictionary/Directory  System 

DDP  -  Distributed  Data  Processing 

DE  -  Data  Element 

DMRS  -  Diary  Message  Reporting  System 

DON  -  Department  of  the  Navy 

EAM  -  Electronic  Auditing  Machinery 

EDP  -  Electronic  Data  Processing 

EDY'R  -  Enlisted  Distribution  Verification  Report 

EMF  -  Enlisted  Master  File 

EPMAC  -  Enlisted  Personnel  Management  Center 

F/M  -  File  Maintenance 

GFR  -  General  Functional  Requirements 

GS  -  General  Schedule 

I  BA  -  Information  Benefit  Analysis 

IMPDB  -  Integrated  Military  Personnel  Data  Base 

I/O  -  Input/Output 

IP  -  Integer  Program 

IR  -  Information  Resource 

IRE  -  Information  Resources  Encyclopedia 

IRM  -  Information  Resources  Management 

I RSTRATPLAN  -  Information  and  Related  Resources  Strategic  Plan 

JUMPS  -  Joint  Uniform  Military  Pay  System 

EDM  -  Logical  Data  Model 

I.ES  -  Leave  and  Earnings  Statement 

MIS  -  Management  Information  System 


MAPTIS  -  Manpower,  Personnel,  and  Training  Information  System 

MOL'  -  Memorandum  of  Understanding 

MPT  -  Manpower,  Personnel,  and  Training 

NUC  -  Navy  Unlisted  Classification  Code 

XUS  -  Navy  Unlisfcd  System 

NFC  -  Navy  Finance  Center 

XMPC  -  Naval  Military  Personnel  Command 

NMPC-  16  -  Total  Force  Information  Systems  Management  Department 

NMPC-163  -  Customer  Support  Division 

NMPC- 164  -  Data  Management  Division 

NMPC- 1 641  -  Corporate  Data  Maintenance  Branch 

XMPC-1641U  -  Unlisted  Research  Correction  Section 

NMPC- 1642  -  Data  Implementation  Branch 

NMPC-1642C  -  Data  Quality  Program  Section 

NMPC- 165  -  Corporate  Data  Systems  Division 

NMPC- 1 66  -  Field  Personnel  Systems  Division 

NMPC-167  -  Technology  Support  Division 

NOCS  -  NFS  On-line  Correction  System 

NRA  -  Navy  Recruit  Accession 

OCR  -  Optical  Character  Recognition 

ODC  -  Officer  Data  Card 

ODCR  -  Officer  Distribution  Control  Report 

OMF  -  OfTicer  Master  File 

OP-16  -  Total  Force  Information  Resources  and  Systems  Management  Division 

OR  -  Operations  Research 

OSD  -  Office  of  the  Secretary  of  Defense 

PSD  -  Personnel  Support  Detachment 

RAS  -  Resource  Accounting  System 

SDLC  -  Systems  Development  Life  Cycle 

SDS  -  Source  Data  System 

SUCNAV  -  Secretary  of  the  Navy 

SOP  -  Standard  Operating  Procedures 

TAC  -  Transaction 

TQM  -  Total  Quality  Management 
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APPENDIX  B.  SUMMARY  OF  NES  TRANSACTION  STATISTICS 


TRANSACTIONS  AND  ERROR  RATES  PER  MONTH 
YEAR  TOTAL  OVERALL  TAC'S  TAC'S 

AND  MONTH  TRANSACTIONS  TAC  INPUT  FOR  N1652 

OF  REPORT  PROCESSED  ERROR  RATE  BY  N1652  RESEARCH 


8512 

764816 

8601 

502312 

8602 

725094 

8604 

560024 

8606 

763363 

8607 

631439 

8608 

576737 

8609 

561344 

8610 

508379 

8611 

719261 

8612 

568107 

8702 

703976 

8703 

533534 

8704 

695772 

8706 

563128 

8707 

645609 

8708 

589549 

8709 

631230 

8710 

609015 

8711 

532247 

8712 

569784 

8802 

728831 

8803 

505411 

8809 

649806 

8902 

452770 

8903 

609843 

8904 

614925 

8905 

611708 

8907 

692946 

8908 

595520 

8909 

698265 

8910 

556664 

AVERAGES 

614732 

SUMS 

19671409 

YEARLY  AVG 

7376778 

21 

4794 

25290 

24.8 

2881 

5592 

19.9 

6538 

16976 

20.5 

12686 

25951 

14.9 

17953 

21814 

12.9 

5892 

14095 

10.5 

13612 

10573 

8.7 

5182 

6826 

11 

5282 

8598 

13 

6767 

12932 

13.2 

14025 

11036 

19.5 

6888 

7940 

22.8 

3944 

6925 

6.9 

4991 

7965 

7.5 

5109 

17966 

12.5 

15519 

54530 

10.2 

7110 

36926 

19.7 

35478 

51870 

7.7 

10529 

27493 

10.5 

11755 

34350 

8.4 

19529 

27021 

7.5 

40862 

31924 

7.5 

10191 

17000 

6.6 

5297 

22616 

6.2 

10640 

13455 

6.1 

112090 

16145 

4.9 

41856 

14527 

6.3 

92849 

9635 

6 

27990 

20133 

8 

34375 

22281 

4.6 

78487 

14884 

6.4 

13350 

21480 

11.4 

21389 

19898 

684451 

636749 

256669 

238781 
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APPENDIX  C.  TABLE  OF  NES  TRANSACTIONS  (TAC) 


l  AC 

INSCRIPTION  OR  PURPOSF  ^ 

A62 

Almn  previous  rating  (enlisted  rank  and  specialty)  with  that  in  the  member  s  service  record. 

A  68 

1.  pdate  or  correct  prospective  rate. 

BUI 

Input  or  change  a  code  indicating  what  degree  commissioning  program  member  is  participating  in. 

B29 

Align  the  basic  test  battery  scores  with  those  in  the  member  s  service  record. 

B48 

Input  or  change  a  language  ability. 

B55 

Input,  change,  or  correct  security  data. 

B77 

Change  the  six  year  obligator  code  for  those  enrolled  in  an  Advanced  Electronic  F'ield. 

CAC 

Build  a  skeleton  record  on  a  member  who  enlisted  in  the  Delaved  Entry  Program. 

CAD 

Delete  members  who  have  enlisted  tn  the  CACHE  program  and  not  reported  or  lost  those  reported  erro¬ 
neously  by  U  SAREC. 

C03 

Align  reserve  contract  extension  with  that  in  member  s  service  record. 

Cu4 

Input  an  estimated  date  of  loss  to  the  Navy  and  a  reason  for  loss. 

C2I 

Align  type  of  enlistment  and  type  of  acquisition  data  with  that  in  the  member  s  service  record. 

C24 

Align  Military  Obligation  Designator  data  with  that  m  the  members  service  record 

maxm 

Correct  the  number  of  enlistments  and  the  member’s  Branch  and  Class  :  service. 

C26 

Input  the  reason  member  is  retained  on  file  beyond  the  end  of  their  active  obligated  service. 

C32 

Change  the  member  s  date  of  birth. 

09 

Change  or  correct  the  member  s  citizenship. 

C40 

Align  religion  with  that  in  member  s  service  record. 

C41 

Align  home  of  record  data  with  that  in  member  s  service  record. 

C43 

Correct  place  of  birth. 

C70 

Correct  AFQT  score. 

ms 

Update  type  and  date  of  last  discharge. 

FAIR 

Provide  FMF  record  for  research. 

FTP 

Update  special  program  or  ship  data. 

F;u7 

Update  the  career  history  fields. 

F38 

Input  or  correct  school  history. 

F45 

Change  or  correct  any  one  of  five  entries  for  school  history. 

F.77 

Align  ethnic  group  designation  with  that  in  the  member  s  service  record. 

F8S 

Correct  the  special  program  code. 

F89 

Input  or  correct  the  program  availability  code. 

F.99 

Correct  the  education  iV.d. 

FBK 

Report  action  taken  by  NMPC  on  errors  from  NFC. 

FFF 

Correct  DOD  A  FEES  code. 

CIB 

Input  or  change  G.l.  Bill  Eligibility  data. 

MOB 

Enter  mobilization  gains. 

N  FA 

Change  Navy  Enlisted  Classifications  (NFC). 

NFB 

Add  one  valid  NFC  earned  through  on  the  job  training. 

NFF 

Add  or  delete  an  NFC  earned  through  school. 

NFC 

Permits  NFC  to  report  errors  on  pay  transactions. 

NSP 

Correct  the  special  program  indicator. 

OFF 

Change  the  Success  Chances  for  Recruits  F;ntering  the  Navy  Code. 

PA  1 

Purpose  is  to  input  the  date  Privacy  Act  data  was  contested. 

PA2 

Delete  the  date  Privacy  Act  data  was  contested. 

QAB 

Enter  the  test  scores  from  the  Special  Assignment  Battery. 

QA  P 

Enter  the  Recruit  Assistance  Program  data 

QA2 

Chance,  delete,  or  add  detailers  information. 
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DESCRIPTION  OK  PLKPOSE 


Change  the  distribution  NP.C. 


E  pdate  data  concerning  overseas  assignment  requests. 


Remove  a  flag  before  detailer  makes  an  assignment. 


Input  a  lias  alter  a  detailer  makes  an  assignment. 


Delete  an  availabilitv. 


Change  special  case  codes. 


Change  the  Nuclear  Held  Indicator. 


Obtain  an  assignment  document. 


Field  to  allow  command  turbulence  to  be  monitored. 


Inform  cetailers  that  members  arc  available  for  immediate  as 


Inform  detailers  that  students  are  available  for  1m 


Establish  a  set  of  transfer  orders. 


Provide  prospective  fleet  reserve  ireliremenl  from  active  dulvi  information. 


Cancel  a  set  of  transfer  orders. 


I  pdate  Ol  ARl)  reenlistment  data. 


Reprint  a  set  ot  orders. 


Cancel  a  nreviouslv  recorded  tour  extension. 


Change  or  correct  the  Assigned  Rate. 


Change  the  on  board  orders  cost  data  or  permanent  cnange  of  station  data. 


Change  the  protected  rotation  date  and  protected  rotation  reason. 


Report,  change,  or  correct  F.S  through  E-v  evaluations. 


E2EES23SES31 


Print  or  delete  and  evaluation 


I  o  enter  a  tour  extension. 


E r.ter  a  prospective  gain  and  distribution  data. 


Modify  previously  recorded  duty  preference  information. 


nrr.ont  data. 


iMwuugjmRUBiii.minjm.waB 


Enter  NMPC  Code  number  of  the  detailing  office  responsible  for  members  as?:. 


Enter  duty  preferences  as  submitted  by  members. 


Record  selective  reenlistment  bonus  ua'.a. 


Identity  personnel  who  require  special  consideration  for  assig.nme: 


tion  ,  advanced  electronic  field,  ar.d  lateral  conversion  data. 


ar  obligation  indicator. 


Enter  the  military  spouse  identifier. 


Change  the  sea  dutv  commencement  date. 


Change  the  special  category  code. 


Change  the  shore  dutv  commencement  date. 


Delete  a  complete  record  lof  a  certain  type). 


Input  the  Social  Security  Administration  verifv  key  and  wage  status  indicator. 


Csed  to  correct  data  elements  as  neccssarv 


Correct  term  status  code. 


Add  or  remove  a  special  interest  code. 


Report  members  who  are  unauthorized  absentees,  deserters,  or  in  civilian  custody. 


Process  NEC  data  which  adjusts  EME'  elements  due  to  member  s  unauthorized  absence. 


Process  members  who  are  accessions  to  enlisted  strength  via  M  l  I  S 


Process  active  dutv  members  to  full  strength. 


Establish  a  skeleton  master  record. 


Cancel  a  erroneously  applied  strength  loss  and  restore  to  mas'er  Die. 


Gain  member  on  board  an  activity. 


Cancel  an  activity  loss  processed  in  error,  and  reinstate  member  on  board. 


Process  those  members  who  have  reeniisted  within  c-i  hours  after  discharge 


Change  or  correct  a  member  s  name  to  agree  with  official  documents. 


TAC 

327 

328 

330 

331 

333 

334 
336 
338 
340 


341 

344 

345 


352 

355 

356 
350 
362 
376 
378 
370 

382 

383 

385 

386 

387 

388 
380 
300 
6\\ 
7\\ 
708 
8\\ 
8\\ 
051 
006 
008 


DESCRIPTION  OH  PIKPOSF _ 

Correct  the  member  s  sex  to  agree  with  official  record. _ 

1  pdate  present  rate. _ 

Process  or  chance  Total  Obligated  Submarine  Sersice  data. 

Correct  or  modify  submarine  pay  data. _ 

Report  or  change  special  qualification  not  identifiable  by  N  PC  or  rate. _ 

Change  or  correct  member's  branch  and  class  of  service  to  agree  with  the  service  record 

Align  dependency  status  with  NFC  and  service  record. _ 

Change  the  primary  NEC  on  non-rated  personnel. _ 

Forward  to  NFC  all  guilty  courts  martial  findings,  all  \JP  s  which  affect  pay  and  rate,  administrative 
actions,  or  restoration  of  above. 

Apply  data  from  NFC  regarding  above. _ 

Process  SSN  changes. _ 

Record  the  number  of  dependents  residing  overseas  with  member  attached  to  an  overseas  station  or  shi 
home  ported  overseas. _ _ _ 

Process  changes  to  population  group. _ 

Process  proficiency  pav  additions  or  changes. _ 

Process  changes  which  reflect  the  reason  for  unavailability  of  members  for  certain  types  of  duty. 
Process  active  duty  service  date  changes. 

Change  a  date  on  which  mem  her  was  received  to  a  command. 

Modify  a  accounting  category  code. _ 

Process  executed  reserve  contract  extensions. _ 

Align  active  duty  obligation  data  with  the  member  s  service  record. _ 

Process  a  LSNR  agreement  to  remain  on  active  duly  or  a  L  SN  agreement  to  ev.end  an  enlistment. 

Process  an  operative  extension  of  a  l  S\  member  s  enlistment, _ 

Process  the  number  of  months  of  involuntary  extension  of  a  member  s  active  duty  obligation. _ 

Process  a  cancellation  of  3  previously  executed  extension  of  enlistment  for  l  5N  or  1  SNR  members. 

Make  operative  a  previously  executed  agreement  to  extend  the  active  duty  of  LSN  members. _ 

Cancel  a  previously  executed  agreement  to  extent  active  duty  for  LSNR  members _ 

Make  operative  a  previously  executed  agreement  to  extend  a  I.SNR  member  s  enlistment. _ 

Process  a  correction  to  a  member's  pav  entry  base  date. 

Apply  activity  losses  to  the  on  board  or  past  activity  data. 

Same  as  above. _ 

Cancel  an  activity  gam  reported  in  error. 

Process  strength  losses  from  the  Naval  Service _ _ _ 

Same  as  above.  _ _ _ _ _ 

Process  data  which  identifies  those  members  who  are  considered  deserters. 

Align  loss  data  m  NFC  and  NMPC  tiles. _ 

Remove  members  who  have  been  sained  to  active  Naval  strength  in  err  >r. 
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APPENDIX  D.  INDIVIDUAL  TRANSACTION  ERROR  RATES 


1989 

MONTH 

JAN 

FEB 

MAR 

APR 

MAY 

TAC 

INPUT 

A68 

477 

3547 

4958 

27466 

5543 

ERROR 

32 

298 

102 

311 

528 

N1652 

126 

655 

211 

160 

107 

RESEARCH 

29 

298 

60 

241 

528 

TAC 

INPUT 

QCO 

9065 

9856 

9773 

8871 

7320 

ERROR 

280 

1159 

733 

360 

318 

N47 

1521 

2035 

1325 

1294 

1838 

RESEARCH 

173 

639 

476 

243 

129 

TAC 

INPUT 

300 

4651 

4282 

4689 

4403 

5172 

ERROR 

226 

222 

308 

236 

239 

N1652 

135 

208 

408 

249 

217 

RESEARCH 

78 

91 

167 

135 

124 

TAC 

INPUT 

301 

1320 

827 

867 

1349 

914 

ERROR 

56 

21 

149 

45 

24 

N1652 

12 

18 

49 

57 

17 

RESEARCH 

55 

21 

19 

45 

24 

TAC 

INPUT 

328 

35576 

34884 

32564 

30866 

36769 

ERROR 

2467 

6193 

5795 

3482 

3212 

N1652 

731 

6529 

5355 

1364 

891 

RESEARCH 

2459 

6182 

4593 

3467 

3208 

TAC 

INPUT 

340 

4331 

2987 

4162 

4286 

2864 

ERROR 

325 

247 

532 

482 

229 

N1652 

22 

67 

77 

61 

54 

RESEARCH 

325 

247 

532 

482 

229 

JUN 
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JUL 

AUG 

SEP 

OCT  NOV 

DEC 

TOTAL 

ERROR% 

10281 

9887 

38602 

8367 

109128 

3.10% 

64 

1437 

193 

423 

3388 

192 

169 

155 

2692 

4467 

64 

1429 

191 

420 

3260 

14182 

11830 

9252 

9237 

89386 

12.41% 

4721 

2522 

496 

503 

11092 

2536 

2431 

1802 

1662 

16444 

4190 

2209 

279 

226 

8564 

6509 

5507 

5917 

6127 

47257 

4.64% 

263 

229 

255 

216 

2194 

280 

254 

321 

305 

2377 

115 

100 

114 

94 

1018 

1189 

1048 

1404 

1380 

10298 

5.15% 

32 

40 

75 

88 

530 

22 

37 

37 

55 

304 

32 

40 

75 

88 

399 

34899 

33310 

37651 

52508 

329027 

13.61% 

3842 

2935 

3916 

12950 

44792 

2211 

1314 

1231 

1373 

20999 

3828 

2918 

3884 

12900 

43439 

4017 

3401 

3366 

3545 

32959 

9.41% 

386 

264 

268 

368 

3101 

62 

71 

75 

78 

567 

386 

264 

268 

368 

3101 
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1987 

MONTH 

TAC 

INPUT 

ERROR 

N1652 

RESEARCH 

A68 

TAC 

INPUT 

ERROR 

N47 

RESEARCH 

QCO 

TAC 

INPUT 

ERROR 

N1652 

RESEARCH 

300 

TAC 

INPUT 

ERROR 

N1652 

RESEARCH 

301 

TAC 

INPUT 

ERROR 

N1652 

RESEARCH 

328 

TAC 

INPUT 

ERROR 

N1652 

RESEARCH 

340 

FEB 

MAR 

APR 

MAY  JUN 

4096 

189 

34772 

49113 

21 

8 

73 

154 

168 

61 

84 

48 

19 

8 

73 

131 

10615 

10058 

10240 

8098 

501 

1582 

656 

336 

1236 

2113 

1174 

1023 

220 

309 

537 

171 

4300 

4636 

7034 

3882 

266 

281 

457 

233 

136 

123 

214 

133 

266 

281 

457 

233 

1275 

851 

1232 

64' 

47 

39 

44 

61 

41 

23 

32 

:  1 

41 

36 

44 

37 

28397 

30650 

38347 

21228 

1344 

1531 

1960 

1321 

2622 

765 

1522 

893 

1335 

1511 

1949 

1316 

4464 

5633. 

6486 

4446 

288 

362 

411 

321 

288 

362 

411 

321 
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JUL 

AUG 

SEP 

OCT 

NOV 

DEC 

TOTAL 

ERROR% 

1434 

8473 

7167 

27442 

248 

89372 

222306 

1.20% 

141 

1172 

27 

350 

21 

691 

2S58 

193 

105 

113 

797 

91 

71 

1731 

139 

1171 

27 

.  311 

21 

669 

2569 

9860 

8215 

11079 

.10580 

8604 

9060 

96409 

8.26% 

303 

391 

3200 

348 

310 

334 

7961 

1166 

998 

840 

1324 

853 

946 

11673 

202 

292 

3113 

206 

239 

189 

5478 

4180 

3965 

2784 

4189 

5129 

3112 

43211 

9.50% 

240 

260 

181 

331 

1522 

335 

4106 

132 

172 

174 

190 

1063 

244 

2581 

240 

260 

181 

331 

1510 

279 

4038 

1333 

952 

974 

1125 

923 

986 

10298 

5.33% 

50 

32 

28 

49 

81 

118 

549 

37 

25 

31 

27 

58 

39 

337 

49 

32 

25 

49 

81 

114 

508 

83741 

37560 

9959 

37549 

37083 

15685 

828 

15677 

39981 

11459 

5327 

11428 

56466 

9778 

3082 

9708 

29054 

3448 

2238 

3423 

25570 

3072 

1602 

3045 

390517 

87158 

28838 

86941 

22.32% 

5053 

433 

5193 

473 

3537 

234 

5731 

464 

4918 

406 

3919. 

310 

49380 

3702 

Q 

7.50% 

433 

473 

234 

464 

406 

310 

3702 

84 


TOTAL 

TOTAL 

ERROR%  ERROR% 

109128 

222306 

3.10% 

1.20% 

3388 

2658 

4467 

1731 

AVG 

2.15% 

3260 

2569 

89386 

96409 

12.41% 

8.26% 

11092 

7961 

16444 

11673 

AVG 

10.33% 

8564 

5478 

47257 

43211 

4.64% 

9.50% 

2194 

4106 

2377 

2581 

AVG 

7 . 07% 

1018 

4038 

10298 

10298 

5.15% 

5.33% 

530 

549 

304 

337 

AVG 

5.24% 

399 

508 

329027 

390517 

13.61% 

22.32% 

44792 

87158 

20999 

28838 

AVG 

17.97% 

43439 

86941 

32959 

49380 

9.41% 

7.50% 

3101 

3702 

567 

0 

AVG 

8.45% 

3101 

3702 
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APPENDIX  E.  ERROR  TRACKING  SHEET 

USE  THIS  SHEET  TO  RECORD  THE  STATUS  OF  ERRORS  YOU  RESEARCH. 

1.  TRANSACTION  -  Record  the  three  character  transaction  code. 

2.  Check  the  block  which  refers  to  the  transaction  status. 

•  CORRECT  AS  IS  -  Implies  that  even  thought  the  transaction  was  rejected  from 
the  update,  it  is  correct  as  submitted  and  will  simply  be  resubmitted. 

•  CORRECTED  REINPUT  -  Implies  that  there  was  some  correction  which  had 
to  be  made  to  the  transaction  as  submitted,  before  it  could  be  rcinput. 

•  INCORRECT  NOT  REINPUT  -  Implies  the  transaction  was  in  error  and  will 
not  be  submitted  to  the  update. 

3.  DAYS  BETWEEN  UPDATES  -  Record  the  number  of  days  from  the  update  the 
transaction  was  first  submitted  until  the  dav  vou  resubmit.  (Example  TC333  - 
TC330  =  3) 


TRANSACTION 

COR- 

COR- 

IN  COR- 

DAYS  BE- 

RECT 

RECTED 

RECT 

TWEEN  UP- 

RE  INPUT 

REINPUT 

NOT 

REINPUT 

DATES 

APPENDIX  F.  ERROR  PROBABILITIES  AND  CORRECTION  TIMES 


FINDING  THE  ERROR  PROBABILITIES 


TAC  RE  INPUT  CORRECT'D  NOT  INPUT  P  P' 


A68 

1 

1.00 

0.00 

QCO 

14 

19 

81 

0.88 

0.12 

300 

3 

1 

1.00 

0.00 

301 

2 

4 

2 

0.75 

0.25 

328 

3 

1 

2 

0.50 

0.50 

340 

1 

16 

7 

0.96 

0.04 

DAYS  UNTIL  REINPUT 

TAC  A68  QCO  300  301  328  340 

DAYS  2  9  2  2 

1  9  2 

2  1 
3 

2 

1 

3 

2 

2 

2 

2 

2 

2 

1 

NO  DATA  1.9  NO  DATA  9.0  1.7  2.0 
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DAYS 

TAC 


UNTIL  CORRECTED/DELETED 


A68 

QCO 

300 

301 

328 

2 

1 

2 

2 

28 

2 

15 

1 

28 

1 

27 

2 

I 

27 

1 

1 

9 

1 

2 

3 

1 

2 

2 

1 

1 

1 

1 

2 

1 

1 

1 

1 

2 

1 

2 

2 

1 

1 

2 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

2 

1 

1 

1 

1 

2 


340 


88 
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APPENDIX  G.  INTRINSIC  TRANSACTION/STORED  MIS  ERROR 

RATES 


FINDING  THE  STORED  MIS  ERROR  RATE 


VARIABLE 

r 

P 

P' 

ERROR  RATE 

e(T) 

TRANSACTION 

A68 

0.02 

1.00 

0.00 

2.15% 

QCO 

0.10 

0.88 

0.12 

0.00% 

NOTE  1 

300 

0.07 

1.00 

0.00 

7.07% 

301 

0.05 

0.75 

0.25 

0.00% 

NOTE  1 

328 

0.18 

0.50 

0.50 

0.00% 

NOTE  2 

340 

0.08 

0.96 

0.04 

4.68% 

NOTE  1:  0  because  r  is  less  than  P' 

NOTE  2:  P  equal  P'  so  no  valid  value  can  be  found 


VARIABLE 

Cl 

C2 

C3 

TRANSACTION 

A68 

1.5 

2.0 

NO 

DATA 

QCO 

1.5 

1.2 

1.9 

300 

1.5 

17.8 

NO 

DATA. 

301 

1.5 

3.0 

9.0 

328 

1.5 

28.0 

1.7 

340 

1.5 

10.8 

2.0 

90 


VARIABLE 

u(T) 

MIS  ERROR  RATE  e(M) 

TRANSACTION 

A68 

1460 

0.11% 

MISSING  DATA 

QCO 

1080 

0.16% 

300 

1460 

0.19% 

MISSING  DATA 

301 

5475 

0.07% 

328 

1460 

0.16% 

MISSING  DATA 

340 

9000 

0.02% 
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